diff --git a/docs/_posts/ahmedlone127/2024-12-15-12_shot_sta_head_skhead_en.md b/docs/_posts/ahmedlone127/2024-12-15-12_shot_sta_head_skhead_en.md new file mode 100644 index 00000000000000..61cfdebfb1c1dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-12_shot_sta_head_skhead_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English 12_shot_sta_head_skhead MPNetEmbeddings from Nhat1904 +author: John Snow Labs +name: 12_shot_sta_head_skhead +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`12_shot_sta_head_skhead` is a English model originally trained by Nhat1904. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/12_shot_sta_head_skhead_en_5.5.1_3.0_1734306650214.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/12_shot_sta_head_skhead_en_5.5.1_3.0_1734306650214.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("12_shot_sta_head_skhead","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("12_shot_sta_head_skhead","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
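+
+The snippet above assumes an active Spark NLP session and the usual imports. The sketch below is illustrative only (Python API, same model and column names as above) and shows a minimal end-to-end run plus one way to pull the sentence vectors out of the `embeddings` column:
+
+```python
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import MPNetEmbeddings
+from pyspark.ml import Pipeline
+
+# Start a Spark session with Spark NLP on the classpath
+spark = sparknlp.start()
+
+documentAssembler = DocumentAssembler().setInputCol("text").setOutputCol("document")
+embeddings = MPNetEmbeddings.pretrained("12_shot_sta_head_skhead", "en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("embeddings")
+
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+result = Pipeline().setStages([documentAssembler, embeddings]).fit(data).transform(data)
+
+# Each annotation in the "embeddings" column carries its vector in the `embeddings` field
+result.selectExpr("explode(embeddings.embeddings) AS sentence_embedding").show(truncate=False)
+```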
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|12_shot_sta_head_skhead| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|407.0 MB| + +## References + +https://huggingface.co/Nhat1904/12_shot_STA_head_skhead \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-12_shot_sta_head_skhead_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-12_shot_sta_head_skhead_pipeline_en.md new file mode 100644 index 00000000000000..4182a0bbac97a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-12_shot_sta_head_skhead_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English 12_shot_sta_head_skhead_pipeline pipeline MPNetEmbeddings from Nhat1904 +author: John Snow Labs +name: 12_shot_sta_head_skhead_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`12_shot_sta_head_skhead_pipeline` is a English model originally trained by Nhat1904. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/12_shot_sta_head_skhead_pipeline_en_5.5.1_3.0_1734306672117.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/12_shot_sta_head_skhead_pipeline_en_5.5.1_3.0_1734306672117.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("12_shot_sta_head_skhead_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("12_shot_sta_head_skhead_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
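+
+For quick checks on a single string, the same pretrained pipeline can be used without building a DataFrame first. The sketch below is illustrative, assuming the standard Spark NLP Python API and the pipeline name used above:
+
+```python
+import sparknlp
+from sparknlp.pretrained import PretrainedPipeline
+
+spark = sparknlp.start()
+
+pipeline = PretrainedPipeline("12_shot_sta_head_skhead_pipeline", lang="en")
+
+# annotate() runs the whole pipeline on one string and returns a dict keyed by
+# the annotators' output columns; fullAnnotate() keeps the full Annotation objects
+result = pipeline.annotate("I love spark-nlp")
+print(list(result.keys()))
+```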
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|12_shot_sta_head_skhead_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.0 MB| + +## References + +https://huggingface.co/Nhat1904/12_shot_STA_head_skhead + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-32_shot_twitter_en.md b/docs/_posts/ahmedlone127/2024-12-15-32_shot_twitter_en.md new file mode 100644 index 00000000000000..2306d90fa327c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-32_shot_twitter_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English 32_shot_twitter MPNetEmbeddings from Nhat1904 +author: John Snow Labs +name: 32_shot_twitter +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`32_shot_twitter` is a English model originally trained by Nhat1904. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/32_shot_twitter_en_5.5.1_3.0_1734306891116.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/32_shot_twitter_en_5.5.1_3.0_1734306891116.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("32_shot_twitter","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("32_shot_twitter","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|32_shot_twitter| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/Nhat1904/32-shot-twitter \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-32_shot_twitter_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-32_shot_twitter_pipeline_en.md new file mode 100644 index 00000000000000..9ae4e403402bdc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-32_shot_twitter_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English 32_shot_twitter_pipeline pipeline MPNetEmbeddings from Nhat1904 +author: John Snow Labs +name: 32_shot_twitter_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`32_shot_twitter_pipeline` is a English model originally trained by Nhat1904. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/32_shot_twitter_pipeline_en_5.5.1_3.0_1734306912085.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/32_shot_twitter_pipeline_en_5.5.1_3.0_1734306912085.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("32_shot_twitter_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("32_shot_twitter_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|32_shot_twitter_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/Nhat1904/32-shot-twitter + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-579_stmodel_v3_en.md b/docs/_posts/ahmedlone127/2024-12-15-579_stmodel_v3_en.md new file mode 100644 index 00000000000000..9430b502fe8502 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-579_stmodel_v3_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English 579_stmodel_v3 MPNetEmbeddings from jamiehudson +author: John Snow Labs +name: 579_stmodel_v3 +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`579_stmodel_v3` is a English model originally trained by jamiehudson. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/579_stmodel_v3_en_5.5.1_3.0_1734306189311.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/579_stmodel_v3_en_5.5.1_3.0_1734306189311.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("579_stmodel_v3","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("579_stmodel_v3","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|579_stmodel_v3| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/jamiehudson/579-STmodel-v3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-579_stmodel_v3_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-579_stmodel_v3_pipeline_en.md new file mode 100644 index 00000000000000..e6467778429ee8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-579_stmodel_v3_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English 579_stmodel_v3_pipeline pipeline MPNetEmbeddings from jamiehudson +author: John Snow Labs +name: 579_stmodel_v3_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`579_stmodel_v3_pipeline` is a English model originally trained by jamiehudson. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/579_stmodel_v3_pipeline_en_5.5.1_3.0_1734306211957.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/579_stmodel_v3_pipeline_en_5.5.1_3.0_1734306211957.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("579_stmodel_v3_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("579_stmodel_v3_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|579_stmodel_v3_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.3 MB| + +## References + +https://huggingface.co/jamiehudson/579-STmodel-v3 + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-579_stmodel_v5_en.md b/docs/_posts/ahmedlone127/2024-12-15-579_stmodel_v5_en.md new file mode 100644 index 00000000000000..8702f5cfa43413 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-579_stmodel_v5_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English 579_stmodel_v5 MPNetEmbeddings from jamiehudson +author: John Snow Labs +name: 579_stmodel_v5 +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`579_stmodel_v5` is a English model originally trained by jamiehudson. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/579_stmodel_v5_en_5.5.1_3.0_1734306753325.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/579_stmodel_v5_en_5.5.1_3.0_1734306753325.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("579_stmodel_v5","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("579_stmodel_v5","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|579_stmodel_v5| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/jamiehudson/579-STmodel-v5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-579_stmodel_v5_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-579_stmodel_v5_pipeline_en.md new file mode 100644 index 00000000000000..b10463689ab11b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-579_stmodel_v5_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English 579_stmodel_v5_pipeline pipeline MPNetEmbeddings from jamiehudson +author: John Snow Labs +name: 579_stmodel_v5_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`579_stmodel_v5_pipeline` is a English model originally trained by jamiehudson. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/579_stmodel_v5_pipeline_en_5.5.1_3.0_1734306774828.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/579_stmodel_v5_pipeline_en_5.5.1_3.0_1734306774828.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("579_stmodel_v5_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("579_stmodel_v5_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|579_stmodel_v5_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/jamiehudson/579-STmodel-v5 + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-adfler_albert_base_v2_en.md b/docs/_posts/ahmedlone127/2024-12-15-adfler_albert_base_v2_en.md new file mode 100644 index 00000000000000..baabdbe7101890 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-adfler_albert_base_v2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English adfler_albert_base_v2 AlbertForTokenClassification from swardiantara +author: John Snow Labs +name: adfler_albert_base_v2 +date: 2024-12-15 +tags: [en, open_source, onnx, token_classification, albert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: AlbertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained AlbertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`adfler_albert_base_v2` is a English model originally trained by swardiantara. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/adfler_albert_base_v2_en_5.5.1_3.0_1734288809469.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/adfler_albert_base_v2_en_5.5.1_3.0_1734288809469.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = AlbertForTokenClassification.pretrained("adfler_albert_base_v2","en") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = AlbertForTokenClassification.pretrained("adfler_albert_base_v2", "en")
+    .setInputCols(Array("document", "token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
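+
+Once the pipeline has run, the `ner` column holds one predicted label per token. An illustrative way to view tokens next to their labels (same column names as in the example above):
+
+```python
+# One row per document: the tokens and the label predicted for each of them
+pipelineDF.selectExpr("token.result AS tokens", "ner.result AS ner_labels").show(truncate=False)
+```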
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|adfler_albert_base_v2| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|42.0 MB| + +## References + +https://huggingface.co/swardiantara/ADFLER-albert-base-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-adfler_albert_base_v2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-adfler_albert_base_v2_pipeline_en.md new file mode 100644 index 00000000000000..933ca5a32a25e8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-adfler_albert_base_v2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English adfler_albert_base_v2_pipeline pipeline AlbertForTokenClassification from swardiantara +author: John Snow Labs +name: adfler_albert_base_v2_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained AlbertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`adfler_albert_base_v2_pipeline` is a English model originally trained by swardiantara. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/adfler_albert_base_v2_pipeline_en_5.5.1_3.0_1734288811710.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/adfler_albert_base_v2_pipeline_en_5.5.1_3.0_1734288811710.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("adfler_albert_base_v2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("adfler_albert_base_v2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|adfler_albert_base_v2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|42.0 MB| + +## References + +https://huggingface.co/swardiantara/ADFLER-albert-base-v2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- AlbertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-adl_hw1_qa_model2_en.md b/docs/_posts/ahmedlone127/2024-12-15-adl_hw1_qa_model2_en.md new file mode 100644 index 00000000000000..8943b8e01ed8d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-adl_hw1_qa_model2_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English adl_hw1_qa_model2 BertForQuestionAnswering from b09501048 +author: John Snow Labs +name: adl_hw1_qa_model2 +date: 2024-12-15 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`adl_hw1_qa_model2` is a English model originally trained by b09501048. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/adl_hw1_qa_model2_en_5.5.1_3.0_1734296721335.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/adl_hw1_qa_model2_en_5.5.1_3.0_1734296721335.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("adl_hw1_qa_model2","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
+data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("adl_hw1_qa_model2", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDF("question", "context")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
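+
+The extracted answer span is returned in the `answer` column. An illustrative way to view it next to the question (same column names as in the example above):
+
+```python
+# The `result` field of the answer annotation holds the predicted answer text
+pipelineDF.selectExpr("document_question.result AS question", "answer.result AS answer").show(truncate=False)
+```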
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|adl_hw1_qa_model2| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|381.0 MB| + +## References + +https://huggingface.co/b09501048/adl_hw1_qa_model2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-adl_hw1_qa_model2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-adl_hw1_qa_model2_pipeline_en.md new file mode 100644 index 00000000000000..ed04a1a4f0ba31 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-adl_hw1_qa_model2_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English adl_hw1_qa_model2_pipeline pipeline BertForQuestionAnswering from b09501048 +author: John Snow Labs +name: adl_hw1_qa_model2_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`adl_hw1_qa_model2_pipeline` is a English model originally trained by b09501048. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/adl_hw1_qa_model2_pipeline_en_5.5.1_3.0_1734296741262.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/adl_hw1_qa_model2_pipeline_en_5.5.1_3.0_1734296741262.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("adl_hw1_qa_model2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("adl_hw1_qa_model2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|adl_hw1_qa_model2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|381.0 MB| + +## References + +https://huggingface.co/b09501048/adl_hw1_qa_model2 + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_augmentation_indomain_bm25_sts_en.md b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_augmentation_indomain_bm25_sts_en.md new file mode 100644 index 00000000000000..4aa6e6d89ebc3a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_augmentation_indomain_bm25_sts_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English all_mpnet_base_v2_augmentation_indomain_bm25_sts MPNetEmbeddings from armaniii +author: John Snow Labs +name: all_mpnet_base_v2_augmentation_indomain_bm25_sts +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`all_mpnet_base_v2_augmentation_indomain_bm25_sts` is a English model originally trained by armaniii. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_augmentation_indomain_bm25_sts_en_5.5.1_3.0_1734306932281.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_augmentation_indomain_bm25_sts_en_5.5.1_3.0_1734306932281.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("all_mpnet_base_v2_augmentation_indomain_bm25_sts","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("all_mpnet_base_v2_augmentation_indomain_bm25_sts","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_mpnet_base_v2_augmentation_indomain_bm25_sts| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.7 MB| + +## References + +https://huggingface.co/armaniii/all-mpnet-base-v2-augmentation-indomain-bm25-sts \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_augmentation_indomain_bm25_sts_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_augmentation_indomain_bm25_sts_pipeline_en.md new file mode 100644 index 00000000000000..8e0fbf82b8b657 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_augmentation_indomain_bm25_sts_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English all_mpnet_base_v2_augmentation_indomain_bm25_sts_pipeline pipeline MPNetEmbeddings from armaniii +author: John Snow Labs +name: all_mpnet_base_v2_augmentation_indomain_bm25_sts_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`all_mpnet_base_v2_augmentation_indomain_bm25_sts_pipeline` is a English model originally trained by armaniii. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_augmentation_indomain_bm25_sts_pipeline_en_5.5.1_3.0_1734306953770.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_augmentation_indomain_bm25_sts_pipeline_en_5.5.1_3.0_1734306953770.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("all_mpnet_base_v2_augmentation_indomain_bm25_sts_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("all_mpnet_base_v2_augmentation_indomain_bm25_sts_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_mpnet_base_v2_augmentation_indomain_bm25_sts_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.7 MB| + +## References + +https://huggingface.co/armaniii/all-mpnet-base-v2-augmentation-indomain-bm25-sts + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_bioasq_1epoc_en.md b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_bioasq_1epoc_en.md new file mode 100644 index 00000000000000..fde5b52afcf547 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_bioasq_1epoc_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English all_mpnet_base_v2_bioasq_1epoc MPNetEmbeddings from juanpablomesa +author: John Snow Labs +name: all_mpnet_base_v2_bioasq_1epoc +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`all_mpnet_base_v2_bioasq_1epoc` is a English model originally trained by juanpablomesa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_bioasq_1epoc_en_5.5.1_3.0_1734305792595.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_bioasq_1epoc_en_5.5.1_3.0_1734305792595.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("all_mpnet_base_v2_bioasq_1epoc","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("all_mpnet_base_v2_bioasq_1epoc","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_mpnet_base_v2_bioasq_1epoc| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/juanpablomesa/all-mpnet-base-v2-bioasq-1epoc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_bioasq_1epoc_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_bioasq_1epoc_pipeline_en.md new file mode 100644 index 00000000000000..cfa7bc14fb5fc8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_bioasq_1epoc_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English all_mpnet_base_v2_bioasq_1epoc_pipeline pipeline MPNetEmbeddings from juanpablomesa +author: John Snow Labs +name: all_mpnet_base_v2_bioasq_1epoc_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`all_mpnet_base_v2_bioasq_1epoc_pipeline` is a English model originally trained by juanpablomesa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_bioasq_1epoc_pipeline_en_5.5.1_3.0_1734305813702.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_bioasq_1epoc_pipeline_en_5.5.1_3.0_1734305813702.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("all_mpnet_base_v2_bioasq_1epoc_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("all_mpnet_base_v2_bioasq_1epoc_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_mpnet_base_v2_bioasq_1epoc_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/juanpablomesa/all-mpnet-base-v2-bioasq-1epoc + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_celanese_test_en.md b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_celanese_test_en.md new file mode 100644 index 00000000000000..2e11f8c3ec78b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_celanese_test_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English all_mpnet_base_v2_celanese_test MPNetEmbeddings from testCelUR +author: John Snow Labs +name: all_mpnet_base_v2_celanese_test +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`all_mpnet_base_v2_celanese_test` is a English model originally trained by testCelUR. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_celanese_test_en_5.5.1_3.0_1734306481153.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_celanese_test_en_5.5.1_3.0_1734306481153.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("all_mpnet_base_v2_celanese_test","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("all_mpnet_base_v2_celanese_test","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_mpnet_base_v2_celanese_test| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.8 MB| + +## References + +https://huggingface.co/testCelUR/all-mpnet-base-v2-celanese_test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_celanese_test_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_celanese_test_pipeline_en.md new file mode 100644 index 00000000000000..badb070745db9f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_celanese_test_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English all_mpnet_base_v2_celanese_test_pipeline pipeline MPNetEmbeddings from testCelUR +author: John Snow Labs +name: all_mpnet_base_v2_celanese_test_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`all_mpnet_base_v2_celanese_test_pipeline` is a English model originally trained by testCelUR. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_celanese_test_pipeline_en_5.5.1_3.0_1734306503656.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_celanese_test_pipeline_en_5.5.1_3.0_1734306503656.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("all_mpnet_base_v2_celanese_test_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("all_mpnet_base_v2_celanese_test_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_mpnet_base_v2_celanese_test_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.8 MB| + +## References + +https://huggingface.co/testCelUR/all-mpnet-base-v2-celanese_test + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_fine_tuned_en.md b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_fine_tuned_en.md new file mode 100644 index 00000000000000..199705d8b5dd96 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_fine_tuned_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English all_mpnet_base_v2_fine_tuned MPNetEmbeddings from AhmetAytar +author: John Snow Labs +name: all_mpnet_base_v2_fine_tuned +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`all_mpnet_base_v2_fine_tuned` is a English model originally trained by AhmetAytar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_fine_tuned_en_5.5.1_3.0_1734306209971.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_fine_tuned_en_5.5.1_3.0_1734306209971.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("all_mpnet_base_v2_fine_tuned","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("all_mpnet_base_v2_fine_tuned","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_mpnet_base_v2_fine_tuned| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/AhmetAytar/all-mpnet-base-v2-fine-tuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_fine_tuned_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_fine_tuned_pipeline_en.md new file mode 100644 index 00000000000000..94f54ce11db501 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_fine_tuned_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English all_mpnet_base_v2_fine_tuned_pipeline pipeline MPNetEmbeddings from AhmetAytar +author: John Snow Labs +name: all_mpnet_base_v2_fine_tuned_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`all_mpnet_base_v2_fine_tuned_pipeline` is a English model originally trained by AhmetAytar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_fine_tuned_pipeline_en_5.5.1_3.0_1734306238717.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_fine_tuned_pipeline_en_5.5.1_3.0_1734306238717.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("all_mpnet_base_v2_fine_tuned_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("all_mpnet_base_v2_fine_tuned_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_mpnet_base_v2_fine_tuned_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/AhmetAytar/all-mpnet-base-v2-fine-tuned + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_incident_similarity_tuned_en.md b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_incident_similarity_tuned_en.md new file mode 100644 index 00000000000000..00b39ec7801063 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_incident_similarity_tuned_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English all_mpnet_base_v2_incident_similarity_tuned MPNetEmbeddings from yudude +author: John Snow Labs +name: all_mpnet_base_v2_incident_similarity_tuned +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`all_mpnet_base_v2_incident_similarity_tuned` is a English model originally trained by yudude. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_incident_similarity_tuned_en_5.5.1_3.0_1734306687101.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_incident_similarity_tuned_en_5.5.1_3.0_1734306687101.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("all_mpnet_base_v2_incident_similarity_tuned","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("all_mpnet_base_v2_incident_similarity_tuned","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_mpnet_base_v2_incident_similarity_tuned| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/yudude/all-mpnet-base-v2-incident-similarity-tuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_incident_similarity_tuned_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_incident_similarity_tuned_pipeline_en.md new file mode 100644 index 00000000000000..7351dea1cce64c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_incident_similarity_tuned_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English all_mpnet_base_v2_incident_similarity_tuned_pipeline pipeline MPNetEmbeddings from yudude +author: John Snow Labs +name: all_mpnet_base_v2_incident_similarity_tuned_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`all_mpnet_base_v2_incident_similarity_tuned_pipeline` is a English model originally trained by yudude. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_incident_similarity_tuned_pipeline_en_5.5.1_3.0_1734306712180.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_incident_similarity_tuned_pipeline_en_5.5.1_3.0_1734306712180.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("all_mpnet_base_v2_incident_similarity_tuned_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("all_mpnet_base_v2_incident_similarity_tuned_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_mpnet_base_v2_incident_similarity_tuned_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/yudude/all-mpnet-base-v2-incident-similarity-tuned + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_unfair_tos_rationale_en.md b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_unfair_tos_rationale_en.md new file mode 100644 index 00000000000000..05528e6cebf51f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_unfair_tos_rationale_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English all_mpnet_base_v2_unfair_tos_rationale MPNetEmbeddings from cruzlorite +author: John Snow Labs +name: all_mpnet_base_v2_unfair_tos_rationale +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`all_mpnet_base_v2_unfair_tos_rationale` is a English model originally trained by cruzlorite. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_unfair_tos_rationale_en_5.5.1_3.0_1734306316031.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_unfair_tos_rationale_en_5.5.1_3.0_1734306316031.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("all_mpnet_base_v2_unfair_tos_rationale","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("all_mpnet_base_v2_unfair_tos_rationale","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
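+
+To read the vectors back out of `pipelineDF`, a small sketch, assuming the standard Spark NLP annotation schema and the column names used above:
+
+```python
+# Each annotation in the "embeddings" column exposes the sentence vector in its
+# `embeddings` field; explode it to get one vector per input row.
+pipelineDF.selectExpr("explode(embeddings.embeddings) as sentence_embedding").show(truncate=80)
+```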
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_mpnet_base_v2_unfair_tos_rationale| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/cruzlorite/all-mpnet-base-v2-unfair-tos-rationale \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_unfair_tos_rationale_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_unfair_tos_rationale_pipeline_en.md new file mode 100644 index 00000000000000..567740db0f5668 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-all_mpnet_base_v2_unfair_tos_rationale_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English all_mpnet_base_v2_unfair_tos_rationale_pipeline pipeline MPNetEmbeddings from cruzlorite +author: John Snow Labs +name: all_mpnet_base_v2_unfair_tos_rationale_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`all_mpnet_base_v2_unfair_tos_rationale_pipeline` is a English model originally trained by cruzlorite. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_unfair_tos_rationale_pipeline_en_5.5.1_3.0_1734306337214.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_unfair_tos_rationale_pipeline_en_5.5.1_3.0_1734306337214.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("all_mpnet_base_v2_unfair_tos_rationale_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("all_mpnet_base_v2_unfair_tos_rationale_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_mpnet_base_v2_unfair_tos_rationale_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/cruzlorite/all-mpnet-base-v2-unfair-tos-rationale + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-attack_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2024-12-15-attack_bert_finetuned_ner_en.md new file mode 100644 index 00000000000000..b6d835618ddf5e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-attack_bert_finetuned_ner_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English attack_bert_finetuned_ner MPNetForTokenClassification from zohreaz +author: John Snow Labs +name: attack_bert_finetuned_ner +date: 2024-12-15 +tags: [en, open_source, onnx, token_classification, mpnet, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`attack_bert_finetuned_ner` is a English model originally trained by zohreaz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/attack_bert_finetuned_ner_en_5.5.1_3.0_1734290705350.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/attack_bert_finetuned_ner_en_5.5.1_3.0_1734290705350.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = MPNetForTokenClassification.pretrained("attack_bert_finetuned_ner","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = MPNetForTokenClassification.pretrained("attack_bert_finetuned_ner", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
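+
+Once the pipeline has run, the predicted tags can be read from the `ner` output column; a brief sketch, assuming the column names used above:
+
+```python
+# One predicted entity label per token, in input order.
+pipelineDF.selectExpr("explode(ner.result) as ner_label").show(truncate=False)
+```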
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|attack_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/zohreaz/ATTACK-BERT-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-attack_bert_finetuned_ner_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-attack_bert_finetuned_ner_pipeline_en.md new file mode 100644 index 00000000000000..9a1a237177b6f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-attack_bert_finetuned_ner_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English attack_bert_finetuned_ner_pipeline pipeline MPNetForTokenClassification from zohreaz +author: John Snow Labs +name: attack_bert_finetuned_ner_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`attack_bert_finetuned_ner_pipeline` is a English model originally trained by zohreaz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/attack_bert_finetuned_ner_pipeline_en_5.5.1_3.0_1734290726087.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/attack_bert_finetuned_ner_pipeline_en_5.5.1_3.0_1734290726087.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("attack_bert_finetuned_ner_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("attack_bert_finetuned_ner_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|attack_bert_finetuned_ner_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/zohreaz/ATTACK-BERT-finetuned-ner + +## Included Models + +- DocumentAssembler +- TokenizerModel +- MPNetForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-autotrain_9ikup_ih7yd_en.md b/docs/_posts/ahmedlone127/2024-12-15-autotrain_9ikup_ih7yd_en.md new file mode 100644 index 00000000000000..e8fdce94721c6d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-autotrain_9ikup_ih7yd_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English autotrain_9ikup_ih7yd MPNetForSequenceClassification from Milan97 +author: John Snow Labs +name: autotrain_9ikup_ih7yd +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, mpnet] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`autotrain_9ikup_ih7yd` is a English model originally trained by Milan97. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_9ikup_ih7yd_en_5.5.1_3.0_1734294957894.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_9ikup_ih7yd_en_5.5.1_3.0_1734294957894.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = MPNetForSequenceClassification.pretrained("autotrain_9ikup_ih7yd","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = MPNetForSequenceClassification.pretrained("autotrain_9ikup_ih7yd", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
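+
+The predicted class for each row can then be read from the `class` output column; a short sketch, assuming the column names used above:
+
+```python
+# The original text alongside the predicted label(s).
+pipelineDF.select("text", "class.result").show(truncate=False)
+```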
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_9ikup_ih7yd| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.9 MB| + +## References + +https://huggingface.co/Milan97/autotrain-9ikup-ih7yd \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-autotrain_9ikup_ih7yd_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-autotrain_9ikup_ih7yd_pipeline_en.md new file mode 100644 index 00000000000000..3bd5ef2f49cd15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-autotrain_9ikup_ih7yd_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English autotrain_9ikup_ih7yd_pipeline pipeline MPNetForSequenceClassification from Milan97 +author: John Snow Labs +name: autotrain_9ikup_ih7yd_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`autotrain_9ikup_ih7yd_pipeline` is a English model originally trained by Milan97. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_9ikup_ih7yd_pipeline_en_5.5.1_3.0_1734294979649.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_9ikup_ih7yd_pipeline_en_5.5.1_3.0_1734294979649.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("autotrain_9ikup_ih7yd_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("autotrain_9ikup_ih7yd_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_9ikup_ih7yd_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|408.9 MB| + +## References + +https://huggingface.co/Milan97/autotrain-9ikup-ih7yd + +## Included Models + +- DocumentAssembler +- TokenizerModel +- MPNetForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-autotrain_po0st_um4bf_en.md b/docs/_posts/ahmedlone127/2024-12-15-autotrain_po0st_um4bf_en.md new file mode 100644 index 00000000000000..861edf2b56efb4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-autotrain_po0st_um4bf_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English autotrain_po0st_um4bf MPNetForSequenceClassification from ulisesbravo +author: John Snow Labs +name: autotrain_po0st_um4bf +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, mpnet] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`autotrain_po0st_um4bf` is a English model originally trained by ulisesbravo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_po0st_um4bf_en_5.5.1_3.0_1734294920330.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_po0st_um4bf_en_5.5.1_3.0_1734294920330.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = MPNetForSequenceClassification.pretrained("autotrain_po0st_um4bf","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = MPNetForSequenceClassification.pretrained("autotrain_po0st_um4bf", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_po0st_um4bf| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.3 MB| + +## References + +https://huggingface.co/ulisesbravo/autotrain-po0st-um4bf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-autotrain_po0st_um4bf_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-autotrain_po0st_um4bf_pipeline_en.md new file mode 100644 index 00000000000000..80cb19e3be9d5e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-autotrain_po0st_um4bf_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English autotrain_po0st_um4bf_pipeline pipeline MPNetForSequenceClassification from ulisesbravo +author: John Snow Labs +name: autotrain_po0st_um4bf_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`autotrain_po0st_um4bf_pipeline` is a English model originally trained by ulisesbravo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_po0st_um4bf_pipeline_en_5.5.1_3.0_1734294941675.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_po0st_um4bf_pipeline_en_5.5.1_3.0_1734294941675.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("autotrain_po0st_um4bf_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("autotrain_po0st_um4bf_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_po0st_um4bf_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.3 MB| + +## References + +https://huggingface.co/ulisesbravo/autotrain-po0st-um4bf + +## Included Models + +- DocumentAssembler +- TokenizerModel +- MPNetForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bart_base_job_info_summarizer_en.md b/docs/_posts/ahmedlone127/2024-12-15-bart_base_job_info_summarizer_en.md new file mode 100644 index 00000000000000..b975c99be8e033 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bart_base_job_info_summarizer_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bart_base_job_info_summarizer BartTransformer from avisena +author: John Snow Labs +name: bart_base_job_info_summarizer +date: 2024-12-15 +tags: [en, open_source, onnx, text_generation, bart] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BartTransformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BartTransformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bart_base_job_info_summarizer` is a English model originally trained by avisena. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bart_base_job_info_summarizer_en_5.5.1_3.0_1734305004314.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bart_base_job_info_summarizer_en_5.5.1_3.0_1734305004314.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+seq2seq = BartTransformer.pretrained("bart_base_job_info_summarizer","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("generation")
+
+pipeline = Pipeline().setStages([documentAssembler, seq2seq])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val seq2seq = BartTransformer.pretrained("bart_base_job_info_summarizer","en")
+    .setInputCols(Array("document"))
+    .setOutputCol("generation")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, seq2seq))
+val data = Seq("I love spark-nlp").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
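+
+The generated text can be read back from the `generation` output column; a short sketch, assuming the column names used above:
+
+```python
+# The summary produced by the BART model for each input row.
+pipelineDF.select("generation.result").show(truncate=False)
+```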
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bart_base_job_info_summarizer| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|809.6 MB| + +## References + +https://huggingface.co/avisena/bart-base-job-info-summarizer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bart_base_job_info_summarizer_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-bart_base_job_info_summarizer_pipeline_en.md new file mode 100644 index 00000000000000..2e59ee8d5fe6e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bart_base_job_info_summarizer_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bart_base_job_info_summarizer_pipeline pipeline BartTransformer from avisena +author: John Snow Labs +name: bart_base_job_info_summarizer_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BartTransformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bart_base_job_info_summarizer_pipeline` is a English model originally trained by avisena. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bart_base_job_info_summarizer_pipeline_en_5.5.1_3.0_1734305052624.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bart_base_job_info_summarizer_pipeline_en_5.5.1_3.0_1734305052624.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bart_base_job_info_summarizer_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bart_base_job_info_summarizer_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bart_base_job_info_summarizer_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|809.6 MB| + +## References + +https://huggingface.co/avisena/bart-base-job-info-summarizer + +## Included Models + +- DocumentAssembler +- BartTransformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bart_large_cnn_samsum_chatgpt_v3_qiliang_en.md b/docs/_posts/ahmedlone127/2024-12-15-bart_large_cnn_samsum_chatgpt_v3_qiliang_en.md new file mode 100644 index 00000000000000..7e16dc570dd8a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bart_large_cnn_samsum_chatgpt_v3_qiliang_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bart_large_cnn_samsum_chatgpt_v3_qiliang BartTransformer from Qiliang +author: John Snow Labs +name: bart_large_cnn_samsum_chatgpt_v3_qiliang +date: 2024-12-15 +tags: [en, open_source, onnx, text_generation, bart] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BartTransformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BartTransformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bart_large_cnn_samsum_chatgpt_v3_qiliang` is a English model originally trained by Qiliang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bart_large_cnn_samsum_chatgpt_v3_qiliang_en_5.5.1_3.0_1734303933913.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bart_large_cnn_samsum_chatgpt_v3_qiliang_en_5.5.1_3.0_1734303933913.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+seq2seq = BartTransformer.pretrained("bart_large_cnn_samsum_chatgpt_v3_qiliang","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("generation")
+
+pipeline = Pipeline().setStages([documentAssembler, seq2seq])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val seq2seq = BartTransformer.pretrained("bart_large_cnn_samsum_chatgpt_v3_qiliang","en")
+    .setInputCols(Array("document"))
+    .setOutputCol("generation")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, seq2seq))
+val data = Seq("I love spark-nlp").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bart_large_cnn_samsum_chatgpt_v3_qiliang| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.9 GB| + +## References + +https://huggingface.co/Qiliang/bart-large-cnn-samsum-ChatGPT_v3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bart_large_cnn_samsum_chatgpt_v3_qiliang_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-bart_large_cnn_samsum_chatgpt_v3_qiliang_pipeline_en.md new file mode 100644 index 00000000000000..1fdcd347825201 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bart_large_cnn_samsum_chatgpt_v3_qiliang_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bart_large_cnn_samsum_chatgpt_v3_qiliang_pipeline pipeline BartTransformer from Qiliang +author: John Snow Labs +name: bart_large_cnn_samsum_chatgpt_v3_qiliang_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BartTransformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bart_large_cnn_samsum_chatgpt_v3_qiliang_pipeline` is a English model originally trained by Qiliang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bart_large_cnn_samsum_chatgpt_v3_qiliang_pipeline_en_5.5.1_3.0_1734304031272.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bart_large_cnn_samsum_chatgpt_v3_qiliang_pipeline_en_5.5.1_3.0_1734304031272.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bart_large_cnn_samsum_chatgpt_v3_qiliang_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bart_large_cnn_samsum_chatgpt_v3_qiliang_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bart_large_cnn_samsum_chatgpt_v3_qiliang_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.9 GB| + +## References + +https://huggingface.co/Qiliang/bart-large-cnn-samsum-ChatGPT_v3 + +## Included Models + +- DocumentAssembler +- BartTransformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bart_large_xsum_facebook_en.md b/docs/_posts/ahmedlone127/2024-12-15-bart_large_xsum_facebook_en.md new file mode 100644 index 00000000000000..1591545617c2ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bart_large_xsum_facebook_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bart_large_xsum_facebook BartTransformer from facebook +author: John Snow Labs +name: bart_large_xsum_facebook +date: 2024-12-15 +tags: [en, open_source, onnx, text_generation, bart] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BartTransformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BartTransformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bart_large_xsum_facebook` is a English model originally trained by facebook. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bart_large_xsum_facebook_en_5.5.1_3.0_1734304947425.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bart_large_xsum_facebook_en_5.5.1_3.0_1734304947425.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+seq2seq = BartTransformer.pretrained("bart_large_xsum_facebook","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("generation")
+
+pipeline = Pipeline().setStages([documentAssembler, seq2seq])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val seq2seq = BartTransformer.pretrained("bart_large_xsum_facebook","en")
+    .setInputCols(Array("document"))
+    .setOutputCol("generation")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, seq2seq))
+val data = Seq("I love spark-nlp").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bart_large_xsum_facebook| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/facebook/bart-large-xsum \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bart_large_xsum_facebook_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-bart_large_xsum_facebook_pipeline_en.md new file mode 100644 index 00000000000000..a40cae50099e7c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bart_large_xsum_facebook_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bart_large_xsum_facebook_pipeline pipeline BartTransformer from facebook +author: John Snow Labs +name: bart_large_xsum_facebook_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BartTransformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bart_large_xsum_facebook_pipeline` is a English model originally trained by facebook. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bart_large_xsum_facebook_pipeline_en_5.5.1_3.0_1734305279715.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bart_large_xsum_facebook_pipeline_en_5.5.1_3.0_1734305279715.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bart_large_xsum_facebook_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bart_large_xsum_facebook_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bart_large_xsum_facebook_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/facebook/bart-large-xsum + +## Included Models + +- DocumentAssembler +- BartTransformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bart_large_xsum_samsum_lidiya_en.md b/docs/_posts/ahmedlone127/2024-12-15-bart_large_xsum_samsum_lidiya_en.md new file mode 100644 index 00000000000000..fb6fd117f07d48 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bart_large_xsum_samsum_lidiya_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bart_large_xsum_samsum_lidiya BartTransformer from lidiya +author: John Snow Labs +name: bart_large_xsum_samsum_lidiya +date: 2024-12-15 +tags: [en, open_source, onnx, text_generation, bart] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BartTransformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BartTransformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bart_large_xsum_samsum_lidiya` is a English model originally trained by lidiya. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bart_large_xsum_samsum_lidiya_en_5.5.1_3.0_1734304523620.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bart_large_xsum_samsum_lidiya_en_5.5.1_3.0_1734304523620.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+seq2seq = BartTransformer.pretrained("bart_large_xsum_samsum_lidiya","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("generation")
+
+pipeline = Pipeline().setStages([documentAssembler, seq2seq])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val seq2seq = BartTransformer.pretrained("bart_large_xsum_samsum_lidiya","en")
+    .setInputCols(Array("document"))
+    .setOutputCol("generation")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, seq2seq))
+val data = Seq("I love spark-nlp").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bart_large_xsum_samsum_lidiya| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.9 GB| + +## References + +https://huggingface.co/lidiya/bart-large-xsum-samsum \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bart_large_xsum_samsum_lidiya_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-bart_large_xsum_samsum_lidiya_pipeline_en.md new file mode 100644 index 00000000000000..dc41ddb11f4a6a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bart_large_xsum_samsum_lidiya_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bart_large_xsum_samsum_lidiya_pipeline pipeline BartTransformer from lidiya +author: John Snow Labs +name: bart_large_xsum_samsum_lidiya_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BartTransformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bart_large_xsum_samsum_lidiya_pipeline` is a English model originally trained by lidiya. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bart_large_xsum_samsum_lidiya_pipeline_en_5.5.1_3.0_1734304614990.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bart_large_xsum_samsum_lidiya_pipeline_en_5.5.1_3.0_1734304614990.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bart_large_xsum_samsum_lidiya_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bart_large_xsum_samsum_lidiya_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bart_large_xsum_samsum_lidiya_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.9 GB| + +## References + +https://huggingface.co/lidiya/bart-large-xsum-samsum + +## Included Models + +- DocumentAssembler +- BartTransformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bart_small_en.md b/docs/_posts/ahmedlone127/2024-12-15-bart_small_en.md new file mode 100644 index 00000000000000..269d30082b573a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bart_small_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bart_small BartTransformer from lucadiliello +author: John Snow Labs +name: bart_small +date: 2024-12-15 +tags: [en, open_source, onnx, text_generation, bart] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BartTransformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BartTransformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bart_small` is a English model originally trained by lucadiliello. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bart_small_en_5.5.1_3.0_1734304754480.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bart_small_en_5.5.1_3.0_1734304754480.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+seq2seq = BartTransformer.pretrained("bart_small","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("generation")
+
+pipeline = Pipeline().setStages([documentAssembler, seq2seq])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val seq2seq = BartTransformer.pretrained("bart_small","en")
+    .setInputCols(Array("document"))
+    .setOutputCol("generation")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, seq2seq))
+val data = Seq("I love spark-nlp").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bart_small| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|456.2 MB| + +## References + +https://huggingface.co/lucadiliello/bart-small \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bart_small_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-bart_small_pipeline_en.md new file mode 100644 index 00000000000000..b246a513a4045e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bart_small_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bart_small_pipeline pipeline BartTransformer from lucadiliello +author: John Snow Labs +name: bart_small_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BartTransformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bart_small_pipeline` is a English model originally trained by lucadiliello. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bart_small_pipeline_en_5.5.1_3.0_1734304777367.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bart_small_pipeline_en_5.5.1_3.0_1734304777367.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bart_small_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bart_small_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bart_small_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|456.2 MB| + +## References + +https://huggingface.co/lucadiliello/bart-small + +## Included Models + +- DocumentAssembler +- BartTransformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-behpouyan_ner_fa.md b/docs/_posts/ahmedlone127/2024-12-15-behpouyan_ner_fa.md new file mode 100644 index 00000000000000..f5a15539759b02 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-behpouyan_ner_fa.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Persian behpouyan_ner AlbertForTokenClassification from Behpouyan +author: John Snow Labs +name: behpouyan_ner +date: 2024-12-15 +tags: [fa, open_source, onnx, token_classification, albert, ner] +task: Named Entity Recognition +language: fa +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: AlbertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained AlbertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`behpouyan_ner` is a Persian model originally trained by Behpouyan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/behpouyan_ner_fa_5.5.1_3.0_1734288831584.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/behpouyan_ner_fa_5.5.1_3.0_1734288831584.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = AlbertForTokenClassification.pretrained("behpouyan_ner","fa") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = AlbertForTokenClassification.pretrained("behpouyan_ner", "fa")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|behpouyan_ner| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|fa| +|Size:|42.0 MB| + +## References + +https://huggingface.co/Behpouyan/Behpouyan-NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-behpouyan_ner_pipeline_fa.md b/docs/_posts/ahmedlone127/2024-12-15-behpouyan_ner_pipeline_fa.md new file mode 100644 index 00000000000000..e63c5fa2e98683 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-behpouyan_ner_pipeline_fa.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Persian behpouyan_ner_pipeline pipeline AlbertForTokenClassification from Behpouyan +author: John Snow Labs +name: behpouyan_ner_pipeline +date: 2024-12-15 +tags: [fa, open_source, pipeline, onnx] +task: Named Entity Recognition +language: fa +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained AlbertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`behpouyan_ner_pipeline` is a Persian model originally trained by Behpouyan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/behpouyan_ner_pipeline_fa_5.5.1_3.0_1734288833834.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/behpouyan_ner_pipeline_fa_5.5.1_3.0_1734288833834.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("behpouyan_ner_pipeline", lang = "fa") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("behpouyan_ner_pipeline", lang = "fa") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|behpouyan_ner_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|fa| +|Size:|42.0 MB| + +## References + +https://huggingface.co/Behpouyan/Behpouyan-NER + +## Included Models + +- DocumentAssembler +- TokenizerModel +- AlbertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_arabic_german_english_indonesian_japanese_wikidump_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_arabic_german_english_indonesian_japanese_wikidump_en.md new file mode 100644 index 00000000000000..9104c3a11d7b41 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_arabic_german_english_indonesian_japanese_wikidump_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_arabic_german_english_indonesian_japanese_wikidump BertEmbeddings from dehanalkautsar +author: John Snow Labs +name: bert_arabic_german_english_indonesian_japanese_wikidump +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_arabic_german_english_indonesian_japanese_wikidump` is a English model originally trained by dehanalkautsar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_arabic_german_english_indonesian_japanese_wikidump_en_5.5.1_3.0_1734283673107.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_arabic_german_english_indonesian_japanese_wikidump_en_5.5.1_3.0_1734283673107.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_arabic_german_english_indonesian_japanese_wikidump","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = BertEmbeddings.pretrained("bert_arabic_german_english_indonesian_japanese_wikidump","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_arabic_german_english_indonesian_japanese_wikidump| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[bert]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/dehanalkautsar/bert_ar_de_en_id_ja_wikidump \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_arabic_german_english_indonesian_japanese_wikidump_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_arabic_german_english_indonesian_japanese_wikidump_pipeline_en.md new file mode 100644 index 00000000000000..f3c1041d911989 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_arabic_german_english_indonesian_japanese_wikidump_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_arabic_german_english_indonesian_japanese_wikidump_pipeline pipeline BertEmbeddings from dehanalkautsar +author: John Snow Labs +name: bert_arabic_german_english_indonesian_japanese_wikidump_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_arabic_german_english_indonesian_japanese_wikidump_pipeline` is a English model originally trained by dehanalkautsar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_arabic_german_english_indonesian_japanese_wikidump_pipeline_en_5.5.1_3.0_1734283694436.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_arabic_german_english_indonesian_japanese_wikidump_pipeline_en_5.5.1_3.0_1734283694436.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_arabic_german_english_indonesian_japanese_wikidump_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_arabic_german_english_indonesian_japanese_wikidump_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_arabic_german_english_indonesian_japanese_wikidump_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/dehanalkautsar/bert_ar_de_en_id_ja_wikidump + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_base_squad2_uncased_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_base_squad2_uncased_en.md new file mode 100644 index 00000000000000..ca2b48bb719e1a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_base_squad2_uncased_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_base_squad2_uncased BertForQuestionAnswering from zelcakok +author: John Snow Labs +name: bert_base_squad2_uncased +date: 2024-12-15 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_squad2_uncased` is a English model originally trained by zelcakok. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_squad2_uncased_en_5.5.1_3.0_1734296841907.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_squad2_uncased_en_5.5.1_3.0_1734296841907.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = MultiDocumentAssembler() \
    .setInputCols(["question", "context"]) \
    .setOutputCols(["document_question", "document_context"])

spanClassifier = BertForQuestionAnswering.pretrained("bert_base_squad2_uncased","en") \
    .setInputCols(["document_question","document_context"]) \
    .setOutputCol("answer")

pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("document_question", "document_context")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new MultiDocumentAssembler()
    .setInputCols(Array("question", "context"))
    .setOutputCols(Array("document_question", "document_context"))

val spanClassifier = BertForQuestionAnswering.pretrained("bert_base_squad2_uncased", "en")
    .setInputCols(Array("document_question","document_context"))
    .setOutputCol("answer")

val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
val data = Seq(("What framework do I use?","I use spark-nlp.")).toDS.toDF("document_question", "document_context")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
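
Once `pipelineDF` has been materialized as above, the predicted span can be read from the `answer` column. A minimal sketch:

```python
# Each "answer" entry is an annotation; "result" holds the predicted answer text
pipelineDF.selectExpr("explode(answer) as ans") \
    .selectExpr("ans.result as answer_text", "ans.metadata") \
    .show(truncate=False)
```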
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_squad2_uncased| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.6 MB| + +## References + +https://huggingface.co/zelcakok/bert-base-squad2-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_base_squad2_uncased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_base_squad2_uncased_pipeline_en.md new file mode 100644 index 00000000000000..bc6fcea1cc3c7a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_base_squad2_uncased_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_base_squad2_uncased_pipeline pipeline BertForQuestionAnswering from zelcakok +author: John Snow Labs +name: bert_base_squad2_uncased_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_squad2_uncased_pipeline` is a English model originally trained by zelcakok. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_squad2_uncased_pipeline_en_5.5.1_3.0_1734296862615.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_squad2_uncased_pipeline_en_5.5.1_3.0_1734296862615.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_squad2_uncased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_squad2_uncased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_squad2_uncased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.6 MB| + +## References + +https://huggingface.co/zelcakok/bert-base-squad2-uncased + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_base_train_book_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_base_train_book_en.md new file mode 100644 index 00000000000000..83d63910d8cc2b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_base_train_book_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_train_book DistilBertEmbeddings from gokulsrinivasagan +author: John Snow Labs +name: bert_base_train_book +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_train_book` is a English model originally trained by gokulsrinivasagan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_train_book_en_5.5.1_3.0_1734289269979.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_train_book_en_5.5.1_3.0_1734289269979.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("bert_base_train_book","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("bert_base_train_book","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
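
The fitted pipeline above emits one annotation per token in the `embeddings` column, each carrying the token text and its vector. A short sketch of flattening them:

```python
from pyspark.sql import functions as F

# "result" is the token text, "embeddings" the corresponding vector
token_vectors = pipelineDF.select(F.explode("embeddings").alias("emb")) \
    .select(F.col("emb.result").alias("token"), F.col("emb.embeddings").alias("vector"))
token_vectors.show(truncate=80)
```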
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_train_book| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|408.0 MB| + +## References + +https://huggingface.co/gokulsrinivasagan/bert_base_train_book \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_base_train_book_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_base_train_book_pipeline_en.md new file mode 100644 index 00000000000000..41d33e87b0a7dc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_base_train_book_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_train_book_pipeline pipeline DistilBertEmbeddings from gokulsrinivasagan +author: John Snow Labs +name: bert_base_train_book_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_train_book_pipeline` is a English model originally trained by gokulsrinivasagan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_train_book_pipeline_en_5.5.1_3.0_1734289291578.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_train_book_pipeline_en_5.5.1_3.0_1734289291578.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_train_book_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_train_book_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
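
Besides `transform`, a `PretrainedPipeline` can annotate plain strings in memory. A minimal sketch, assuming an active Spark NLP session:

```python
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("bert_base_train_book_pipeline", lang = "en")

# fullAnnotate accepts a string (or list of strings) and returns the annotations directly
result = pipeline.fullAnnotate("I love spark-nlp")[0]
print(result.keys())  # annotation columns produced by the included models
```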
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_train_book_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|408.0 MB| + +## References + +https://huggingface.co/gokulsrinivasagan/bert_base_train_book + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_1_0_lr_4e_06_wd_0_1_dp_0_55_swati_12345_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_1_0_lr_4e_06_wd_0_1_dp_0_55_swati_12345_en.md new file mode 100644 index 00000000000000..5586f0673fffd7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_1_0_lr_4e_06_wd_0_1_dp_0_55_swati_12345_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_1_0_lr_4e_06_wd_0_1_dp_0_55_swati_12345 BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_1_0_lr_4e_06_wd_0_1_dp_0_55_swati_12345 +date: 2024-12-15 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetune_squad_ep_1_0_lr_4e_06_wd_0_1_dp_0_55_swati_12345` is a English model originally trained by danielkty22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_1_0_lr_4e_06_wd_0_1_dp_0_55_swati_12345_en_5.5.1_3.0_1734297187410.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_1_0_lr_4e_06_wd_0_1_dp_0_55_swati_12345_en_5.5.1_3.0_1734297187410.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = MultiDocumentAssembler() \
    .setInputCols(["question", "context"]) \
    .setOutputCols(["document_question", "document_context"])

spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_1_0_lr_4e_06_wd_0_1_dp_0_55_swati_12345","en") \
    .setInputCols(["document_question","document_context"]) \
    .setOutputCol("answer")

pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("document_question", "document_context")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new MultiDocumentAssembler()
    .setInputCols(Array("question", "context"))
    .setOutputCols(Array("document_question", "document_context"))

val spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_1_0_lr_4e_06_wd_0_1_dp_0_55_swati_12345", "en")
    .setInputCols(Array("document_question","document_context"))
    .setOutputCol("answer")

val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
val data = Seq(("What framework do I use?","I use spark-nlp.")).toDS.toDF("document_question", "document_context")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
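
The same fitted pipeline scores several question/context pairs in one pass. A sketch with a hypothetical two-row input:

```python
qa_data = spark.createDataFrame([
    ["What framework do I use?", "I use spark-nlp."],
    ["Who fine-tuned the model?", "The model was fine-tuned by danielkty22 on SQuAD."],
]).toDF("document_question", "document_context")

answers = pipelineModel.transform(qa_data)
answers.select("document_question", "answer.result").show(truncate=False)
```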
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_1_0_lr_4e_06_wd_0_1_dp_0_55_swati_12345| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-1.0-lr-4e-06-wd-0.1-dp-0.55-ss-12345 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_1_0_lr_4e_06_wd_0_1_dp_0_55_swati_12345_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_1_0_lr_4e_06_wd_0_1_dp_0_55_swati_12345_pipeline_en.md new file mode 100644 index 00000000000000..a2407b410ad399 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_1_0_lr_4e_06_wd_0_1_dp_0_55_swati_12345_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_1_0_lr_4e_06_wd_0_1_dp_0_55_swati_12345_pipeline pipeline BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_1_0_lr_4e_06_wd_0_1_dp_0_55_swati_12345_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetune_squad_ep_1_0_lr_4e_06_wd_0_1_dp_0_55_swati_12345_pipeline` is a English model originally trained by danielkty22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_1_0_lr_4e_06_wd_0_1_dp_0_55_swati_12345_pipeline_en_5.5.1_3.0_1734297211408.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_1_0_lr_4e_06_wd_0_1_dp_0_55_swati_12345_pipeline_en_5.5.1_3.0_1734297211408.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetune_squad_ep_1_0_lr_4e_06_wd_0_1_dp_0_55_swati_12345_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetune_squad_ep_1_0_lr_4e_06_wd_0_1_dp_0_55_swati_12345_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_1_0_lr_4e_06_wd_0_1_dp_0_55_swati_12345_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-1.0-lr-4e-06-wd-0.1-dp-0.55-ss-12345 + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_1_0_lr_5e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_500_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_1_0_lr_5e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_500_en.md new file mode 100644 index 00000000000000..3743e86f38584a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_1_0_lr_5e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_500_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_1_0_lr_5e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_500 BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_1_0_lr_5e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_500 +date: 2024-12-15 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetune_squad_ep_1_0_lr_5e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_500` is a English model originally trained by danielkty22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_1_0_lr_5e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_500_en_5.5.1_3.0_1734297161214.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_1_0_lr_5e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_500_en_5.5.1_3.0_1734297161214.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = MultiDocumentAssembler() \
    .setInputCols(["question", "context"]) \
    .setOutputCols(["document_question", "document_context"])

spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_1_0_lr_5e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_500","en") \
    .setInputCols(["document_question","document_context"]) \
    .setOutputCol("answer")

pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("document_question", "document_context")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new MultiDocumentAssembler()
    .setInputCols(Array("question", "context"))
    .setOutputCols(Array("document_question", "document_context"))

val spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_1_0_lr_5e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_500", "en")
    .setInputCols(Array("document_question","document_context"))
    .setOutputCol("answer")

val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
val data = Seq(("What framework do I use?","I use spark-nlp.")).toDS.toDF("document_question", "document_context")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_1_0_lr_5e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_500| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-1.0-lr-5e-06-wd-0.001-dp-0.2-ss-0-st-False-fh-False-hs-500 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_1_0_lr_5e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_500_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_1_0_lr_5e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_500_pipeline_en.md new file mode 100644 index 00000000000000..7cf5426e30244b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_1_0_lr_5e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_500_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_1_0_lr_5e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_500_pipeline pipeline BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_1_0_lr_5e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_500_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetune_squad_ep_1_0_lr_5e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_500_pipeline` is a English model originally trained by danielkty22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_1_0_lr_5e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_500_pipeline_en_5.5.1_3.0_1734297181922.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_1_0_lr_5e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_500_pipeline_en_5.5.1_3.0_1734297181922.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetune_squad_ep_1_0_lr_5e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_500_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetune_squad_ep_1_0_lr_5e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_500_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_1_0_lr_5e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_500_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-1.0-lr-5e-06-wd-0.001-dp-0.2-ss-0-st-False-fh-False-hs-500 + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_1_6_lr_1e_05_wd_0_001_dp_0_1_swati_0_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_1_6_lr_1e_05_wd_0_001_dp_0_1_swati_0_en.md new file mode 100644 index 00000000000000..d41845c3b3265b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_1_6_lr_1e_05_wd_0_001_dp_0_1_swati_0_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_1_6_lr_1e_05_wd_0_001_dp_0_1_swati_0 BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_1_6_lr_1e_05_wd_0_001_dp_0_1_swati_0 +date: 2024-12-15 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetune_squad_ep_1_6_lr_1e_05_wd_0_001_dp_0_1_swati_0` is a English model originally trained by danielkty22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_1_6_lr_1e_05_wd_0_001_dp_0_1_swati_0_en_5.5.1_3.0_1734297454796.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_1_6_lr_1e_05_wd_0_001_dp_0_1_swati_0_en_5.5.1_3.0_1734297454796.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = MultiDocumentAssembler() \
    .setInputCols(["question", "context"]) \
    .setOutputCols(["document_question", "document_context"])

spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_1_6_lr_1e_05_wd_0_001_dp_0_1_swati_0","en") \
    .setInputCols(["document_question","document_context"]) \
    .setOutputCol("answer")

pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("document_question", "document_context")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new MultiDocumentAssembler()
    .setInputCols(Array("question", "context"))
    .setOutputCols(Array("document_question", "document_context"))

val spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_1_6_lr_1e_05_wd_0_001_dp_0_1_swati_0", "en")
    .setInputCols(Array("document_question","document_context"))
    .setOutputCol("answer")

val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
val data = Seq(("What framework do I use?","I use spark-nlp.")).toDS.toDF("document_question", "document_context")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
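
For low-latency, single-record inference the fitted model can be wrapped in a `LightPipeline`. A sketch, assuming `fullAnnotate` accepts the question and context as its two arguments:

```python
from sparknlp.base import LightPipeline

light = LightPipeline(pipelineModel)

# Question first, context second (assumption noted above)
result = light.fullAnnotate("What framework do I use?", "I use spark-nlp.")[0]
print([a.result for a in result["answer"]])
```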
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_1_6_lr_1e_05_wd_0_001_dp_0_1_swati_0| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-1.6-lr-1e-05-wd-0.001-dp-0.1-ss-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_1_6_lr_1e_05_wd_0_001_dp_0_1_swati_0_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_1_6_lr_1e_05_wd_0_001_dp_0_1_swati_0_pipeline_en.md new file mode 100644 index 00000000000000..80852eb81ad213 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_1_6_lr_1e_05_wd_0_001_dp_0_1_swati_0_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_1_6_lr_1e_05_wd_0_001_dp_0_1_swati_0_pipeline pipeline BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_1_6_lr_1e_05_wd_0_001_dp_0_1_swati_0_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetune_squad_ep_1_6_lr_1e_05_wd_0_001_dp_0_1_swati_0_pipeline` is a English model originally trained by danielkty22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_1_6_lr_1e_05_wd_0_001_dp_0_1_swati_0_pipeline_en_5.5.1_3.0_1734297476489.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_1_6_lr_1e_05_wd_0_001_dp_0_1_swati_0_pipeline_en_5.5.1_3.0_1734297476489.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetune_squad_ep_1_6_lr_1e_05_wd_0_001_dp_0_1_swati_0_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetune_squad_ep_1_6_lr_1e_05_wd_0_001_dp_0_1_swati_0_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_1_6_lr_1e_05_wd_0_001_dp_0_1_swati_0_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-1.6-lr-1e-05-wd-0.001-dp-0.1-ss-0 + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_800_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_800_en.md new file mode 100644 index 00000000000000..47ed16561019aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_800_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_800 BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_800 +date: 2024-12-15 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_800` is a English model originally trained by danielkty22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_800_en_5.5.1_3.0_1734297044112.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_800_en_5.5.1_3.0_1734297044112.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = MultiDocumentAssembler() \
    .setInputCols(["question", "context"]) \
    .setOutputCols(["document_question", "document_context"])

spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_800","en") \
    .setInputCols(["document_question","document_context"]) \
    .setOutputCol("answer")

pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("document_question", "document_context")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new MultiDocumentAssembler()
    .setInputCols(Array("question", "context"))
    .setOutputCols(Array("document_question", "document_context"))

val spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_800", "en")
    .setInputCols(Array("document_question","document_context"))
    .setOutputCol("answer")

val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
val data = Seq(("What framework do I use?","I use spark-nlp.")).toDS.toDF("document_question", "document_context")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_800| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-2.0-lr-1e-05-wd-0.001-dp-0.2-ss-0-st-False-fh-False-hs-800 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_800_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_800_pipeline_en.md new file mode 100644 index 00000000000000..74c2c2cf63d8b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_800_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_800_pipeline pipeline BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_800_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_800_pipeline` is a English model originally trained by danielkty22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_800_pipeline_en_5.5.1_3.0_1734297065591.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_800_pipeline_en_5.5.1_3.0_1734297065591.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_800_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_800_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_800_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-2.0-lr-1e-05-wd-0.001-dp-0.2-ss-0-st-False-fh-False-hs-800 + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_1_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_1_en.md new file mode 100644 index 00000000000000..b1663b571781ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_1_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_1 BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_1 +date: 2024-12-15 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_1` is a English model originally trained by danielkty22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_1_en_5.5.1_3.0_1734296630655.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_1_en_5.5.1_3.0_1734296630655.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = MultiDocumentAssembler() \
    .setInputCols(["question", "context"]) \
    .setOutputCols(["document_question", "document_context"])

spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_1","en") \
    .setInputCols(["document_question","document_context"]) \
    .setOutputCol("answer")

pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("document_question", "document_context")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new MultiDocumentAssembler()
    .setInputCols(Array("question", "context"))
    .setOutputCols(Array("document_question", "document_context"))

val spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_1", "en")
    .setInputCols(Array("document_question","document_context"))
    .setOutputCol("answer")

val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
val data = Seq(("What framework do I use?","I use spark-nlp.")).toDS.toDF("document_question", "document_context")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
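
For long contexts, the annotator's sequence length and batch size can be tuned before the pipeline is assembled. A sketch with illustrative values:

```python
spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_1", "en") \
    .setInputCols(["document_question", "document_context"]) \
    .setOutputCol("answer") \
    .setCaseSensitive(False) \
    .setMaxSentenceLength(512) \
    .setBatchSize(8)
```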
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_1| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-2.0-lr-5e-05-wd-0.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_1_pipeline_en.md new file mode 100644 index 00000000000000..f1ecb2800b5d6d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_1_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_1_pipeline pipeline BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_1_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_1_pipeline` is a English model originally trained by danielkty22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_1_pipeline_en_5.5.1_3.0_1734296652179.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_1_pipeline_en_5.5.1_3.0_1734296652179.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-2.0-lr-5e-05-wd-0.1 + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_3_0_lr_1e_05_wd_0_001_dp_0_7_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_3_0_lr_1e_05_wd_0_001_dp_0_7_en.md new file mode 100644 index 00000000000000..096a9e85de3e46 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_3_0_lr_1e_05_wd_0_001_dp_0_7_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_3_0_lr_1e_05_wd_0_001_dp_0_7 BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_3_0_lr_1e_05_wd_0_001_dp_0_7 +date: 2024-12-15 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetune_squad_ep_3_0_lr_1e_05_wd_0_001_dp_0_7` is a English model originally trained by danielkty22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_3_0_lr_1e_05_wd_0_001_dp_0_7_en_5.5.1_3.0_1734297477809.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_3_0_lr_1e_05_wd_0_001_dp_0_7_en_5.5.1_3.0_1734297477809.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = MultiDocumentAssembler() \
    .setInputCols(["question", "context"]) \
    .setOutputCols(["document_question", "document_context"])

spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_3_0_lr_1e_05_wd_0_001_dp_0_7","en") \
    .setInputCols(["document_question","document_context"]) \
    .setOutputCol("answer")

pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("document_question", "document_context")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new MultiDocumentAssembler()
    .setInputCols(Array("question", "context"))
    .setOutputCols(Array("document_question", "document_context"))

val spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_3_0_lr_1e_05_wd_0_001_dp_0_7", "en")
    .setInputCols(Array("document_question","document_context"))
    .setOutputCol("answer")

val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
val data = Seq(("What framework do I use?","I use spark-nlp.")).toDS.toDF("document_question", "document_context")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
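
To get plain string columns instead of annotation structs, a `Finisher` stage can be appended to the pipeline above. A small sketch:

```python
from sparknlp.base import Finisher
from pyspark.ml import Pipeline

finisher = Finisher() \
    .setInputCols(["answer"]) \
    .setOutputCols(["answer_text"]) \
    .setCleanAnnotations(True)

qa_pipeline = Pipeline().setStages([documentAssembler, spanClassifier, finisher])
finished = qa_pipeline.fit(data).transform(data)
finished.select("answer_text").show(truncate=False)
```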
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_3_0_lr_1e_05_wd_0_001_dp_0_7| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-3.0-lr-1e-05-wd-0.001-dp-0.7 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_3_0_lr_1e_05_wd_0_001_dp_0_7_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_3_0_lr_1e_05_wd_0_001_dp_0_7_pipeline_en.md new file mode 100644 index 00000000000000..24dc6de756678d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_3_0_lr_1e_05_wd_0_001_dp_0_7_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_3_0_lr_1e_05_wd_0_001_dp_0_7_pipeline pipeline BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_3_0_lr_1e_05_wd_0_001_dp_0_7_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetune_squad_ep_3_0_lr_1e_05_wd_0_001_dp_0_7_pipeline` is a English model originally trained by danielkty22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_3_0_lr_1e_05_wd_0_001_dp_0_7_pipeline_en_5.5.1_3.0_1734297501682.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_3_0_lr_1e_05_wd_0_001_dp_0_7_pipeline_en_5.5.1_3.0_1734297501682.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetune_squad_ep_3_0_lr_1e_05_wd_0_001_dp_0_7_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetune_squad_ep_3_0_lr_1e_05_wd_0_001_dp_0_7_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_3_0_lr_1e_05_wd_0_001_dp_0_7_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-3.0-lr-1e-05-wd-0.001-dp-0.7 + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_900_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_900_en.md new file mode 100644 index 00000000000000..5d5d69725fb23e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_900_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_900 BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_900 +date: 2024-12-15 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_900` is a English model originally trained by danielkty22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_900_en_5.5.1_3.0_1734297084720.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_900_en_5.5.1_3.0_1734297084720.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = MultiDocumentAssembler() \
    .setInputCols(["question", "context"]) \
    .setOutputCols(["document_question", "document_context"])

spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_900","en") \
    .setInputCols(["document_question","document_context"]) \
    .setOutputCol("answer")

pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("document_question", "document_context")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new MultiDocumentAssembler()
    .setInputCols(Array("question", "context"))
    .setOutputCols(Array("document_question", "document_context"))

val spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_900", "en")
    .setInputCols(Array("document_question","document_context"))
    .setOutputCol("answer")

val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
val data = Seq(("What framework do I use?","I use spark-nlp.")).toDS.toDF("document_question", "document_context")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_900| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-4.0-lr-1e-06-wd-0.001-dp-0.2-ss-0-st-False-fh-False-hs-900 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_900_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_900_pipeline_en.md new file mode 100644 index 00000000000000..8d079627774266 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_900_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_900_pipeline pipeline BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_900_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_900_pipeline` is a English model originally trained by danielkty22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_900_pipeline_en_5.5.1_3.0_1734297105443.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_900_pipeline_en_5.5.1_3.0_1734297105443.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_900_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_900_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_0_southern_sotho_false_fh_false_hs_900_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-4.0-lr-1e-06-wd-0.001-dp-0.2-ss-0-st-False-fh-False-hs-900 + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetuned_squad_lauraparra28_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetuned_squad_lauraparra28_en.md new file mode 100644 index 00000000000000..23327d73c08aab --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetuned_squad_lauraparra28_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_squad_lauraparra28 BertForQuestionAnswering from lauraparra28 +author: John Snow Labs +name: bert_base_uncased_finetuned_squad_lauraparra28 +date: 2024-12-15 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetuned_squad_lauraparra28` is a English model originally trained by lauraparra28. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_squad_lauraparra28_en_5.5.1_3.0_1734296703264.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_squad_lauraparra28_en_5.5.1_3.0_1734296703264.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetuned_squad_lauraparra28","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
+data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetuned_squad_lauraparra28", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDF("question", "context")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_squad_lauraparra28| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/lauraparra28/bert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetuned_squad_lauraparra28_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetuned_squad_lauraparra28_pipeline_en.md new file mode 100644 index 00000000000000..76071157c603d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetuned_squad_lauraparra28_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_squad_lauraparra28_pipeline pipeline BertForQuestionAnswering from lauraparra28 +author: John Snow Labs +name: bert_base_uncased_finetuned_squad_lauraparra28_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetuned_squad_lauraparra28_pipeline` is a English model originally trained by lauraparra28. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_squad_lauraparra28_pipeline_en_5.5.1_3.0_1734296724944.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_squad_lauraparra28_pipeline_en_5.5.1_3.0_1734296724944.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_squad_lauraparra28_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_squad_lauraparra28_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_squad_lauraparra28_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/lauraparra28/bert-base-uncased-finetuned-squad + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetuned_squad_real_jiakai_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetuned_squad_real_jiakai_en.md new file mode 100644 index 00000000000000..4c0af4aabd82e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetuned_squad_real_jiakai_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_squad_real_jiakai BertForQuestionAnswering from real-jiakai +author: John Snow Labs +name: bert_base_uncased_finetuned_squad_real_jiakai +date: 2024-12-15 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetuned_squad_real_jiakai` is a English model originally trained by real-jiakai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_squad_real_jiakai_en_5.5.1_3.0_1734297340897.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_squad_real_jiakai_en_5.5.1_3.0_1734297340897.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetuned_squad_real_jiakai","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
+data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetuned_squad_real_jiakai", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDF("question", "context")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_squad_real_jiakai| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/real-jiakai/bert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetuned_squad_real_jiakai_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetuned_squad_real_jiakai_pipeline_en.md new file mode 100644 index 00000000000000..7742552a0e9749 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_base_uncased_finetuned_squad_real_jiakai_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_squad_real_jiakai_pipeline pipeline BertForQuestionAnswering from real-jiakai +author: John Snow Labs +name: bert_base_uncased_finetuned_squad_real_jiakai_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetuned_squad_real_jiakai_pipeline` is a English model originally trained by real-jiakai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_squad_real_jiakai_pipeline_en_5.5.1_3.0_1734297362671.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_squad_real_jiakai_pipeline_en_5.5.1_3.0_1734297362671.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_uncased_finetuned_squad_real_jiakai_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_uncased_finetuned_squad_real_jiakai_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_squad_real_jiakai_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/real-jiakai/bert-base-uncased-finetuned-squad + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_covid_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_covid_en.md new file mode 100644 index 00000000000000..f48d79f524d2c0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_covid_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_covid BertForQuestionAnswering from hung200504 +author: John Snow Labs +name: bert_covid +date: 2024-12-15 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_covid` is a English model originally trained by hung200504. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_covid_en_5.5.1_3.0_1734297165616.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_covid_en_5.5.1_3.0_1734297165616.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_covid","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
+data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("bert_covid", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDF("question", "context")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_covid| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/hung200504/bert-covid \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_covid_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_covid_pipeline_en.md new file mode 100644 index 00000000000000..148e54ee47d36b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_covid_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_covid_pipeline pipeline BertForQuestionAnswering from hung200504 +author: John Snow Labs +name: bert_covid_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_covid_pipeline` is a English model originally trained by hung200504. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_covid_pipeline_en_5.5.1_3.0_1734297188867.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_covid_pipeline_en_5.5.1_3.0_1734297188867.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_covid_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_covid_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_covid_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/hung200504/bert-covid + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_finetuned_ner_en.md new file mode 100644 index 00000000000000..c9212e2962538b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_finetuned_ner_en.md @@ -0,0 +1,96 @@ +--- +layout: model +title: English bert_finetuned_ner RoBertaForTokenClassification from mdroth +author: John Snow Labs +name: bert_finetuned_ner +date: 2024-12-15 +tags: [en, open_source, onnx, token_classification, roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: AlbertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_finetuned_ner` is a English model originally trained by mdroth. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_en_5.5.1_3.0_1734288769300.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_en_5.5.1_3.0_1734288769300.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = RoBertaForTokenClassification.pretrained("bert_finetuned_ner","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = RoBertaForTokenClassification.pretrained("bert_finetuned_ner", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+</div>
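+
+After the pipeline runs, each token's predicted tag is stored in the `ner` output column; a minimal sketch for inspecting it:
+
+```python
+# `token.result` holds the tokens and `ner.result` the predicted tag sequence per row.
+pipelineDF.select("token.result", "ner.result").show(truncate=False)
+```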
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|46.7 MB| + +## References + +References + +https://huggingface.co/mdroth/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_finetuned_ner_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_finetuned_ner_pipeline_en.md new file mode 100644 index 00000000000000..1996e2c7e543ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_finetuned_ner_pipeline_en.md @@ -0,0 +1,72 @@ +--- +layout: model +title: English bert_finetuned_ner_pipeline pipeline RoBertaForTokenClassification from mdroth +author: John Snow Labs +name: bert_finetuned_ner_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_finetuned_ner_pipeline` is a English model originally trained by mdroth. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_pipeline_en_5.5.1_3.0_1734288772062.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_pipeline_en_5.5.1_3.0_1734288772062.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +pipeline = PretrainedPipeline("bert_finetuned_ner_pipeline", lang = "en") +annotations = pipeline.transform(df) +``` +```scala +val pipeline = new PretrainedPipeline("bert_finetuned_ner_pipeline", lang = "en") +val annotations = pipeline.transform(df) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|46.7 MB| + +## References + +References + +https://huggingface.co/mdroth/bert-finetuned-ner + +## Included Models + +- DocumentAssembler +- TokenizerModel +- AlbertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_finetuned_squad_aarnow_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_finetuned_squad_aarnow_en.md new file mode 100644 index 00000000000000..bff180baeba45f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_finetuned_squad_aarnow_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_finetuned_squad_aarnow BertForQuestionAnswering from aarnow +author: John Snow Labs +name: bert_finetuned_squad_aarnow +date: 2024-12-15 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_finetuned_squad_aarnow` is a English model originally trained by aarnow. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_aarnow_en_5.5.1_3.0_1734297329397.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_aarnow_en_5.5.1_3.0_1734297329397.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_aarnow","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
+data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_aarnow", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDF("question", "context")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_aarnow| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/aarnow/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_finetuned_squad_aarnow_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_finetuned_squad_aarnow_pipeline_en.md new file mode 100644 index 00000000000000..bb327986e25b21 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_finetuned_squad_aarnow_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_finetuned_squad_aarnow_pipeline pipeline BertForQuestionAnswering from aarnow +author: John Snow Labs +name: bert_finetuned_squad_aarnow_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_finetuned_squad_aarnow_pipeline` is a English model originally trained by aarnow. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_aarnow_pipeline_en_5.5.1_3.0_1734297349374.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_aarnow_pipeline_en_5.5.1_3.0_1734297349374.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_finetuned_squad_aarnow_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_finetuned_squad_aarnow_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_aarnow_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/aarnow/bert-finetuned-squad + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_finetuned_squad_tuna1283_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_finetuned_squad_tuna1283_en.md new file mode 100644 index 00000000000000..2294fea81c894a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_finetuned_squad_tuna1283_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_finetuned_squad_tuna1283 BertForQuestionAnswering from tuna1283 +author: John Snow Labs +name: bert_finetuned_squad_tuna1283 +date: 2024-12-15 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_finetuned_squad_tuna1283` is a English model originally trained by tuna1283. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_tuna1283_en_5.5.1_3.0_1734297381465.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_tuna1283_en_5.5.1_3.0_1734297381465.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_tuna1283","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
+data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_tuna1283", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDF("question", "context")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_tuna1283| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/tuna1283/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_finetuned_squad_tuna1283_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_finetuned_squad_tuna1283_pipeline_en.md new file mode 100644 index 00000000000000..6330d6209a86f2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_finetuned_squad_tuna1283_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_finetuned_squad_tuna1283_pipeline pipeline BertForQuestionAnswering from tuna1283 +author: John Snow Labs +name: bert_finetuned_squad_tuna1283_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_finetuned_squad_tuna1283_pipeline` is a English model originally trained by tuna1283. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_tuna1283_pipeline_en_5.5.1_3.0_1734297402152.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_tuna1283_pipeline_en_5.5.1_3.0_1734297402152.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_finetuned_squad_tuna1283_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_finetuned_squad_tuna1283_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_tuna1283_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/tuna1283/bert-finetuned-squad + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_southern_sotho_qa_multi_qa_mpnet_base_dot_v1_epochs_1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_southern_sotho_qa_multi_qa_mpnet_base_dot_v1_epochs_1_pipeline_en.md new file mode 100644 index 00000000000000..204d2201e2a1c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_southern_sotho_qa_multi_qa_mpnet_base_dot_v1_epochs_1_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_southern_sotho_qa_multi_qa_mpnet_base_dot_v1_epochs_1_pipeline pipeline MPNetEmbeddings from abhijitt +author: John Snow Labs +name: bert_southern_sotho_qa_multi_qa_mpnet_base_dot_v1_epochs_1_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_southern_sotho_qa_multi_qa_mpnet_base_dot_v1_epochs_1_pipeline` is a English model originally trained by abhijitt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_southern_sotho_qa_multi_qa_mpnet_base_dot_v1_epochs_1_pipeline_en_5.5.1_3.0_1734306065952.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_southern_sotho_qa_multi_qa_mpnet_base_dot_v1_epochs_1_pipeline_en_5.5.1_3.0_1734306065952.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_southern_sotho_qa_multi_qa_mpnet_base_dot_v1_epochs_1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_southern_sotho_qa_multi_qa_mpnet_base_dot_v1_epochs_1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_southern_sotho_qa_multi_qa_mpnet_base_dot_v1_epochs_1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.0 MB| + +## References + +https://huggingface.co/abhijitt/bert_st_qa_multi-qa-mpnet-base-dot-v1-epochs-1 + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_unformatted_network_data_test_6_types_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_unformatted_network_data_test_6_types_en.md new file mode 100644 index 00000000000000..fb2e6a90e69901 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_unformatted_network_data_test_6_types_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_unformatted_network_data_test_6_types RoBertaForSequenceClassification from Jios +author: John Snow Labs +name: bert_unformatted_network_data_test_6_types +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_unformatted_network_data_test_6_types` is a English model originally trained by Jios. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_unformatted_network_data_test_6_types_en_5.5.1_3.0_1734288045905.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_unformatted_network_data_test_6_types_en_5.5.1_3.0_1734288045905.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("bert_unformatted_network_data_test_6_types","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("bert_unformatted_network_data_test_6_types", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
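+
+A minimal sketch for reading the predicted label back out of the `class` output column after running the example above:
+
+```python
+# Each annotation's `result` field carries the predicted label.
+pipelineDF.select("class.result").show(truncate=False)
+```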
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_unformatted_network_data_test_6_types| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/Jios/bert-unformatted-network-data-test-6-types \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-bert_unformatted_network_data_test_6_types_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-bert_unformatted_network_data_test_6_types_pipeline_en.md new file mode 100644 index 00000000000000..ca356ca61ff1c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-bert_unformatted_network_data_test_6_types_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_unformatted_network_data_test_6_types_pipeline pipeline RoBertaForSequenceClassification from Jios +author: John Snow Labs +name: bert_unformatted_network_data_test_6_types_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_unformatted_network_data_test_6_types_pipeline` is a English model originally trained by Jios. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_unformatted_network_data_test_6_types_pipeline_en_5.5.1_3.0_1734288141021.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_unformatted_network_data_test_6_types_pipeline_en_5.5.1_3.0_1734288141021.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_unformatted_network_data_test_6_types_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_unformatted_network_data_test_6_types_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_unformatted_network_data_test_6_types_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/Jios/bert-unformatted-network-data-test-6-types + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-burmese_awesome_opus_books_model_heartiels_en.md b/docs/_posts/ahmedlone127/2024-12-15-burmese_awesome_opus_books_model_heartiels_en.md new file mode 100644 index 00000000000000..fa64aa201741bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-burmese_awesome_opus_books_model_heartiels_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English burmese_awesome_opus_books_model_heartiels T5Transformer from Heartiels +author: John Snow Labs +name: burmese_awesome_opus_books_model_heartiels +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_awesome_opus_books_model_heartiels` is a English model originally trained by Heartiels. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_opus_books_model_heartiels_en_5.5.1_3.0_1734301547724.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_opus_books_model_heartiels_en_5.5.1_3.0_1734301547724.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("burmese_awesome_opus_books_model_heartiels","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("burmese_awesome_opus_books_model_heartiels", "en")
+    .setInputCols(Array("document"))
+    .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
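+
+The generated text lands in the `output` column; a minimal sketch for reading it back:
+
+```python
+# Each annotation's `result` field carries the generated text.
+pipelineDF.select("output.result").show(truncate=False)
+```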
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_opus_books_model_heartiels| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|340.5 MB| + +## References + +https://huggingface.co/Heartiels/my_awesome_opus_books_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-burmese_awesome_opus_books_model_heartiels_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-burmese_awesome_opus_books_model_heartiels_pipeline_en.md new file mode 100644 index 00000000000000..defbf139043e12 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-burmese_awesome_opus_books_model_heartiels_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English burmese_awesome_opus_books_model_heartiels_pipeline pipeline T5Transformer from Heartiels +author: John Snow Labs +name: burmese_awesome_opus_books_model_heartiels_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_awesome_opus_books_model_heartiels_pipeline` is a English model originally trained by Heartiels. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_opus_books_model_heartiels_pipeline_en_5.5.1_3.0_1734301567144.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_opus_books_model_heartiels_pipeline_en_5.5.1_3.0_1734301567144.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("burmese_awesome_opus_books_model_heartiels_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("burmese_awesome_opus_books_model_heartiels_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_opus_books_model_heartiels_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|340.5 MB| + +## References + +https://huggingface.co/Heartiels/my_awesome_opus_books_model + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-burmese_awesome_setfit_model_en.md b/docs/_posts/ahmedlone127/2024-12-15-burmese_awesome_setfit_model_en.md new file mode 100644 index 00000000000000..91b07da1e572db --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-burmese_awesome_setfit_model_en.md @@ -0,0 +1,92 @@ +--- +layout: model +title: English burmese_awesome_setfit_model MPNetEmbeddings from lewtun +author: John Snow Labs +name: burmese_awesome_setfit_model +date: 2024-12-15 +tags: [mpnet, en, open_source, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_awesome_setfit_model` is a English model originally trained by lewtun. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_setfit_model_en_5.5.1_3.0_1734306768820.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_setfit_model_en_5.5.1_3.0_1734306768820.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+embeddings = MPNetEmbeddings.pretrained("burmese_awesome_setfit_model","en") \
+    .setInputCols(["documents"]) \
+    .setOutputCol("mpnet_embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, embeddings])
+
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val embeddings = MPNetEmbeddings
+    .pretrained("burmese_awesome_setfit_model", "en")
+    .setInputCols(Array("documents"))
+    .setOutputCol("mpnet_embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings))
+
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+</div>
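+
+The sentence embedding for each input can be pulled from the `mpnet_embeddings` column; a minimal sketch:
+
+```python
+# The `embeddings` field of each annotation holds the float vector.
+pipelineDF.selectExpr("explode(mpnet_embeddings.embeddings) as sentence_embedding").show(truncate=False)
+```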
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_setfit_model| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.9 MB| + +## References + +References + +https://huggingface.co/lewtun/my-awesome-setfit-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-burmese_awesome_setfit_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-burmese_awesome_setfit_model_pipeline_en.md new file mode 100644 index 00000000000000..a1f444107f7c7e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-burmese_awesome_setfit_model_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English burmese_awesome_setfit_model_pipeline pipeline MPNetEmbeddings from ilhkn +author: John Snow Labs +name: burmese_awesome_setfit_model_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_awesome_setfit_model_pipeline` is a English model originally trained by ilhkn. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_setfit_model_pipeline_en_5.5.1_3.0_1734306789393.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_setfit_model_pipeline_en_5.5.1_3.0_1734306789393.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("burmese_awesome_setfit_model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("burmese_awesome_setfit_model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_setfit_model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/ilhkn/my-awesome-setfit-model + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-caramelo_smile_pipeline_pt.md b/docs/_posts/ahmedlone127/2024-12-15-caramelo_smile_pipeline_pt.md new file mode 100644 index 00000000000000..ae3081f5cf11b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-caramelo_smile_pipeline_pt.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Portuguese caramelo_smile_pipeline pipeline RoBertaForSequenceClassification from Adilmar +author: John Snow Labs +name: caramelo_smile_pipeline +date: 2024-12-15 +tags: [pt, open_source, pipeline, onnx] +task: Text Classification +language: pt +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`caramelo_smile_pipeline` is a Portuguese model originally trained by Adilmar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/caramelo_smile_pipeline_pt_5.5.1_3.0_1734287418431.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/caramelo_smile_pipeline_pt_5.5.1_3.0_1734287418431.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("caramelo_smile_pipeline", lang = "pt") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("caramelo_smile_pipeline", lang = "pt") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|caramelo_smile_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|pt| +|Size:|468.3 MB| + +## References + +https://huggingface.co/Adilmar/caramelo-smile + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-caramelo_smile_pt.md b/docs/_posts/ahmedlone127/2024-12-15-caramelo_smile_pt.md new file mode 100644 index 00000000000000..a64261b19bdb03 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-caramelo_smile_pt.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Portuguese caramelo_smile RoBertaForSequenceClassification from Adilmar +author: John Snow Labs +name: caramelo_smile +date: 2024-12-15 +tags: [pt, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: pt +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`caramelo_smile` is a Portuguese model originally trained by Adilmar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/caramelo_smile_pt_5.5.1_3.0_1734287392171.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/caramelo_smile_pt_5.5.1_3.0_1734287392171.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("caramelo_smile","pt") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("caramelo_smile", "pt")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|caramelo_smile| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|pt| +|Size:|468.3 MB| + +## References + +https://huggingface.co/Adilmar/caramelo-smile \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-cendol_mt5_small_inst_finetuned_maltese_kupang_malay_en.md b/docs/_posts/ahmedlone127/2024-12-15-cendol_mt5_small_inst_finetuned_maltese_kupang_malay_en.md new file mode 100644 index 00000000000000..34072efd3672e1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-cendol_mt5_small_inst_finetuned_maltese_kupang_malay_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English cendol_mt5_small_inst_finetuned_maltese_kupang_malay T5Transformer from joanitolopo +author: John Snow Labs +name: cendol_mt5_small_inst_finetuned_maltese_kupang_malay +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`cendol_mt5_small_inst_finetuned_maltese_kupang_malay` is a English model originally trained by joanitolopo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cendol_mt5_small_inst_finetuned_maltese_kupang_malay_en_5.5.1_3.0_1734302501848.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cendol_mt5_small_inst_finetuned_maltese_kupang_malay_en_5.5.1_3.0_1734302501848.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("cendol_mt5_small_inst_finetuned_maltese_kupang_malay","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("cendol_mt5_small_inst_finetuned_maltese_kupang_malay", "en")
+    .setInputCols(Array("document"))
+    .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
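+
+The generated text ends up in the `output` column configured above; a small sketch of inspecting it, assuming the standard annotation schema:
+
+```python
+# "result" contains the strings produced by the T5 model for each input row.
+pipelineDF.select("output.result").show(truncate = False)
+```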
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cendol_mt5_small_inst_finetuned_maltese_kupang_malay| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/joanitolopo/cendol-mt5-small-inst-finetuned-mt-kupang-malay \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-cendol_mt5_small_inst_finetuned_maltese_kupang_malay_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-cendol_mt5_small_inst_finetuned_maltese_kupang_malay_pipeline_en.md new file mode 100644 index 00000000000000..52e28870656827 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-cendol_mt5_small_inst_finetuned_maltese_kupang_malay_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English cendol_mt5_small_inst_finetuned_maltese_kupang_malay_pipeline pipeline T5Transformer from joanitolopo +author: John Snow Labs +name: cendol_mt5_small_inst_finetuned_maltese_kupang_malay_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`cendol_mt5_small_inst_finetuned_maltese_kupang_malay_pipeline` is a English model originally trained by joanitolopo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cendol_mt5_small_inst_finetuned_maltese_kupang_malay_pipeline_en_5.5.1_3.0_1734302636335.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cendol_mt5_small_inst_finetuned_maltese_kupang_malay_pipeline_en_5.5.1_3.0_1734302636335.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("cendol_mt5_small_inst_finetuned_maltese_kupang_malay_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("cendol_mt5_small_inst_finetuned_maltese_kupang_malay_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cendol_mt5_small_inst_finetuned_maltese_kupang_malay_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/joanitolopo/cendol-mt5-small-inst-finetuned-mt-kupang-malay + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-chinese_roberta_wwm_ext_lr_5e_05_batch_8_epoch_3_model_span_selector_en.md b/docs/_posts/ahmedlone127/2024-12-15-chinese_roberta_wwm_ext_lr_5e_05_batch_8_epoch_3_model_span_selector_en.md new file mode 100644 index 00000000000000..05dbc1a44956ab --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-chinese_roberta_wwm_ext_lr_5e_05_batch_8_epoch_3_model_span_selector_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English chinese_roberta_wwm_ext_lr_5e_05_batch_8_epoch_3_model_span_selector BertForQuestionAnswering from riiwang +author: John Snow Labs +name: chinese_roberta_wwm_ext_lr_5e_05_batch_8_epoch_3_model_span_selector +date: 2024-12-15 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`chinese_roberta_wwm_ext_lr_5e_05_batch_8_epoch_3_model_span_selector` is a English model originally trained by riiwang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chinese_roberta_wwm_ext_lr_5e_05_batch_8_epoch_3_model_span_selector_en_5.5.1_3.0_1734297201812.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chinese_roberta_wwm_ext_lr_5e_05_batch_8_epoch_3_model_span_selector_en_5.5.1_3.0_1734297201812.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("chinese_roberta_wwm_ext_lr_5e_05_batch_8_epoch_3_model_span_selector","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
+data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("chinese_roberta_wwm_ext_lr_5e_05_batch_8_epoch_3_model_span_selector", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+val data = Seq(("What framework do I use?","I use spark-nlp.")).toDF("question", "context")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
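+
+The extracted answer span is returned in the `answer` column set above; a brief sketch of reading it, assuming the standard annotation schema:
+
+```python
+# "result" holds the answer text selected from the context for each question.
+pipelineDF.select("document_question.result", "answer.result").show(truncate = False)
+```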
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chinese_roberta_wwm_ext_lr_5e_05_batch_8_epoch_3_model_span_selector| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|381.0 MB| + +## References + +https://huggingface.co/riiwang/chinese-roberta-wwm-ext_lr_5e-05_batch_8_epoch_3_model_span_selector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-chinese_roberta_wwm_ext_lr_5e_05_batch_8_epoch_3_model_span_selector_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-chinese_roberta_wwm_ext_lr_5e_05_batch_8_epoch_3_model_span_selector_pipeline_en.md new file mode 100644 index 00000000000000..73f9d3df3c5d10 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-chinese_roberta_wwm_ext_lr_5e_05_batch_8_epoch_3_model_span_selector_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English chinese_roberta_wwm_ext_lr_5e_05_batch_8_epoch_3_model_span_selector_pipeline pipeline BertForQuestionAnswering from riiwang +author: John Snow Labs +name: chinese_roberta_wwm_ext_lr_5e_05_batch_8_epoch_3_model_span_selector_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`chinese_roberta_wwm_ext_lr_5e_05_batch_8_epoch_3_model_span_selector_pipeline` is a English model originally trained by riiwang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chinese_roberta_wwm_ext_lr_5e_05_batch_8_epoch_3_model_span_selector_pipeline_en_5.5.1_3.0_1734297220885.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chinese_roberta_wwm_ext_lr_5e_05_batch_8_epoch_3_model_span_selector_pipeline_en_5.5.1_3.0_1734297220885.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("chinese_roberta_wwm_ext_lr_5e_05_batch_8_epoch_3_model_span_selector_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("chinese_roberta_wwm_ext_lr_5e_05_batch_8_epoch_3_model_span_selector_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chinese_roberta_wwm_ext_lr_5e_05_batch_8_epoch_3_model_span_selector_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|381.0 MB| + +## References + +https://huggingface.co/riiwang/chinese-roberta-wwm-ext_lr_5e-05_batch_8_epoch_3_model_span_selector + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-clinical_mobilebert_en.md b/docs/_posts/ahmedlone127/2024-12-15-clinical_mobilebert_en.md new file mode 100644 index 00000000000000..76ae1abab626c0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-clinical_mobilebert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English clinical_mobilebert BertEmbeddings from nlpie +author: John Snow Labs +name: clinical_mobilebert +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`clinical_mobilebert` is a English model originally trained by nlpie. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clinical_mobilebert_en_5.5.1_3.0_1734284336372.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clinical_mobilebert_en_5.5.1_3.0_1734284336372.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("clinical_mobilebert","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = BertEmbeddings.pretrained("clinical_mobilebert","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
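+
+Each token receives one vector in the `embeddings` column; a minimal sketch of unpacking them (field names assume the standard Spark NLP annotation schema):
+
+```python
+# Explode the per-token annotations, then pull out the token text and its vector.
+pipelineDF.selectExpr("explode(embeddings) as emb") \
+    .selectExpr("emb.result as token", "emb.embeddings as vector") \
+    .show(truncate = 80)
+```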
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clinical_mobilebert| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[bert]| +|Language:|en| +|Size:|92.5 MB| + +## References + +https://huggingface.co/nlpie/clinical-mobilebert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-clinical_mobilebert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-clinical_mobilebert_pipeline_en.md new file mode 100644 index 00000000000000..2873dc3210749a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-clinical_mobilebert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English clinical_mobilebert_pipeline pipeline BertEmbeddings from nlpie +author: John Snow Labs +name: clinical_mobilebert_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`clinical_mobilebert_pipeline` is a English model originally trained by nlpie. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clinical_mobilebert_pipeline_en_5.5.1_3.0_1734284340800.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clinical_mobilebert_pipeline_en_5.5.1_3.0_1734284340800.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("clinical_mobilebert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("clinical_mobilebert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clinical_mobilebert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|92.6 MB| + +## References + +https://huggingface.co/nlpie/clinical-mobilebert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-cnn_news_summary_model_trained_on_reduced_data_kacharuk_en.md b/docs/_posts/ahmedlone127/2024-12-15-cnn_news_summary_model_trained_on_reduced_data_kacharuk_en.md new file mode 100644 index 00000000000000..e620b300f13c8f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-cnn_news_summary_model_trained_on_reduced_data_kacharuk_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English cnn_news_summary_model_trained_on_reduced_data_kacharuk T5Transformer from Kacharuk +author: John Snow Labs +name: cnn_news_summary_model_trained_on_reduced_data_kacharuk +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`cnn_news_summary_model_trained_on_reduced_data_kacharuk` is a English model originally trained by Kacharuk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cnn_news_summary_model_trained_on_reduced_data_kacharuk_en_5.5.1_3.0_1734302275817.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cnn_news_summary_model_trained_on_reduced_data_kacharuk_en_5.5.1_3.0_1734302275817.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("cnn_news_summary_model_trained_on_reduced_data_kacharuk","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("cnn_news_summary_model_trained_on_reduced_data_kacharuk", "en")
+    .setInputCols(Array("document"))
+    .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cnn_news_summary_model_trained_on_reduced_data_kacharuk| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|323.8 MB| + +## References + +https://huggingface.co/Kacharuk/cnn_news_summary_model_trained_on_reduced_data \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-cnn_news_summary_model_trained_on_reduced_data_kacharuk_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-cnn_news_summary_model_trained_on_reduced_data_kacharuk_pipeline_en.md new file mode 100644 index 00000000000000..08a661a156abed --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-cnn_news_summary_model_trained_on_reduced_data_kacharuk_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English cnn_news_summary_model_trained_on_reduced_data_kacharuk_pipeline pipeline T5Transformer from Kacharuk +author: John Snow Labs +name: cnn_news_summary_model_trained_on_reduced_data_kacharuk_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`cnn_news_summary_model_trained_on_reduced_data_kacharuk_pipeline` is a English model originally trained by Kacharuk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cnn_news_summary_model_trained_on_reduced_data_kacharuk_pipeline_en_5.5.1_3.0_1734302297555.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cnn_news_summary_model_trained_on_reduced_data_kacharuk_pipeline_en_5.5.1_3.0_1734302297555.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("cnn_news_summary_model_trained_on_reduced_data_kacharuk_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("cnn_news_summary_model_trained_on_reduced_data_kacharuk_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cnn_news_summary_model_trained_on_reduced_data_kacharuk_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|323.8 MB| + +## References + +https://huggingface.co/Kacharuk/cnn_news_summary_model_trained_on_reduced_data + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-comparacion_t5_congelado_en.md b/docs/_posts/ahmedlone127/2024-12-15-comparacion_t5_congelado_en.md new file mode 100644 index 00000000000000..02e644163e79ff --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-comparacion_t5_congelado_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English comparacion_t5_congelado T5Transformer from MartinElMolon +author: John Snow Labs +name: comparacion_t5_congelado +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`comparacion_t5_congelado` is a English model originally trained by MartinElMolon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/comparacion_t5_congelado_en_5.5.1_3.0_1734299554996.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/comparacion_t5_congelado_en_5.5.1_3.0_1734299554996.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("comparacion_t5_congelado","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("comparacion_t5_congelado", "en")
+    .setInputCols(Array("document"))
+    .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|comparacion_t5_congelado| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|759.7 MB| + +## References + +https://huggingface.co/MartinElMolon/comparacion_T5_congelado \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-comparacion_t5_congelado_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-comparacion_t5_congelado_pipeline_en.md new file mode 100644 index 00000000000000..1bfbc7de19818b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-comparacion_t5_congelado_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English comparacion_t5_congelado_pipeline pipeline T5Transformer from MartinElMolon +author: John Snow Labs +name: comparacion_t5_congelado_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`comparacion_t5_congelado_pipeline` is a English model originally trained by MartinElMolon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/comparacion_t5_congelado_pipeline_en_5.5.1_3.0_1734299666088.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/comparacion_t5_congelado_pipeline_en_5.5.1_3.0_1734299666088.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("comparacion_t5_congelado_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("comparacion_t5_congelado_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|comparacion_t5_congelado_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|759.7 MB| + +## References + +https://huggingface.co/MartinElMolon/comparacion_T5_congelado + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-danish_prosusai_finbert_en.md b/docs/_posts/ahmedlone127/2024-12-15-danish_prosusai_finbert_en.md new file mode 100644 index 00000000000000..6ca228e59ba466 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-danish_prosusai_finbert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English danish_prosusai_finbert BertEmbeddings from NLP-FEUP +author: John Snow Labs +name: danish_prosusai_finbert +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`danish_prosusai_finbert` is a English model originally trained by NLP-FEUP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/danish_prosusai_finbert_en_5.5.1_3.0_1734284423651.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/danish_prosusai_finbert_en_5.5.1_3.0_1734284423651.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("danish_prosusai_finbert","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = BertEmbeddings.pretrained("danish_prosusai_finbert","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|danish_prosusai_finbert| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[bert]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/NLP-FEUP/DA-ProsusAI-finbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-danish_prosusai_finbert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-danish_prosusai_finbert_pipeline_en.md new file mode 100644 index 00000000000000..ce3bd28e0c7846 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-danish_prosusai_finbert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English danish_prosusai_finbert_pipeline pipeline BertEmbeddings from NLP-FEUP +author: John Snow Labs +name: danish_prosusai_finbert_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`danish_prosusai_finbert_pipeline` is a English model originally trained by NLP-FEUP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/danish_prosusai_finbert_pipeline_en_5.5.1_3.0_1734284445235.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/danish_prosusai_finbert_pipeline_en_5.5.1_3.0_1734284445235.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("danish_prosusai_finbert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("danish_prosusai_finbert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|danish_prosusai_finbert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/NLP-FEUP/DA-ProsusAI-finbert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-debiasing_pre_trained_contextualised_embeddings_distil_bert_en.md b/docs/_posts/ahmedlone127/2024-12-15-debiasing_pre_trained_contextualised_embeddings_distil_bert_en.md new file mode 100644 index 00000000000000..215d074368d371 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-debiasing_pre_trained_contextualised_embeddings_distil_bert_en.md @@ -0,0 +1,92 @@ +--- +layout: model +title: English debiasing_pre_trained_contextualised_embeddings_distil_bert DistilBertEmbeddings from Daniel-Saeedi +author: John Snow Labs +name: debiasing_pre_trained_contextualised_embeddings_distil_bert +date: 2024-12-15 +tags: [distilbert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`debiasing_pre_trained_contextualised_embeddings_distil_bert` is a English model originally trained by Daniel-Saeedi. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/debiasing_pre_trained_contextualised_embeddings_distil_bert_en_5.5.1_3.0_1734289221102.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/debiasing_pre_trained_contextualised_embeddings_distil_bert_en_5.5.1_3.0_1734289221102.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+embeddings = DistilBertEmbeddings.pretrained("debiasing_pre_trained_contextualised_embeddings_distil_bert","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("embeddings")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])
+
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val embeddings = DistilBertEmbeddings
+    .pretrained("debiasing_pre_trained_contextualised_embeddings_distil_bert", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))
+
+val data = Seq("I love spark-nlp").toDF("text")
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|debiasing_pre_trained_contextualised_embeddings_distil_bert| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +References + +https://huggingface.co/Daniel-Saeedi/debiasing_pre-trained_contextualised_embeddings_distil_bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-debiasing_pre_trained_contextualised_embeddings_distil_bert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-debiasing_pre_trained_contextualised_embeddings_distil_bert_pipeline_en.md new file mode 100644 index 00000000000000..dfccf243773c2d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-debiasing_pre_trained_contextualised_embeddings_distil_bert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English debiasing_pre_trained_contextualised_embeddings_distil_bert_pipeline pipeline DistilBertEmbeddings from DS-20202 +author: John Snow Labs +name: debiasing_pre_trained_contextualised_embeddings_distil_bert_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`debiasing_pre_trained_contextualised_embeddings_distil_bert_pipeline` is a English model originally trained by DS-20202. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/debiasing_pre_trained_contextualised_embeddings_distil_bert_pipeline_en_5.5.1_3.0_1734289235605.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/debiasing_pre_trained_contextualised_embeddings_distil_bert_pipeline_en_5.5.1_3.0_1734289235605.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("debiasing_pre_trained_contextualised_embeddings_distil_bert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("debiasing_pre_trained_contextualised_embeddings_distil_bert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|debiasing_pre_trained_contextualised_embeddings_distil_bert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/DS-20202/debiasing_pre-trained_contextualised_embeddings_distil_bert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbart_cnn_12_3_sshleifer_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbart_cnn_12_3_sshleifer_en.md new file mode 100644 index 00000000000000..a2f1154ae12558 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbart_cnn_12_3_sshleifer_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English distilbart_cnn_12_3_sshleifer BartTransformer from sshleifer +author: John Snow Labs +name: distilbart_cnn_12_3_sshleifer +date: 2024-12-15 +tags: [en, open_source, onnx, text_generation, bart] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BartTransformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BartTransformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbart_cnn_12_3_sshleifer` is a English model originally trained by sshleifer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbart_cnn_12_3_sshleifer_en_5.5.1_3.0_1734303452195.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbart_cnn_12_3_sshleifer_en_5.5.1_3.0_1734303452195.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+seq2seq = BartTransformer.pretrained("distilbart_cnn_12_3_sshleifer","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("generation")
+
+pipeline = Pipeline().setStages([documentAssembler, seq2seq])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val seq2seq = BartTransformer.pretrained("distilbart_cnn_12_3_sshleifer","en")
+    .setInputCols(Array("document"))
+    .setOutputCol("generation")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, seq2seq))
+val data = Seq("I love spark-nlp").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
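+
+The summary produced by the model lands in the `generation` column configured above; a short sketch of viewing it, assuming the standard annotation schema:
+
+```python
+# "result" contains the generated summary string(s) for each input document.
+pipelineDF.select("generation.result").show(truncate = False)
+```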
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbart_cnn_12_3_sshleifer| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|925.6 MB| + +## References + +https://huggingface.co/sshleifer/distilbart-cnn-12-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbart_cnn_12_3_sshleifer_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbart_cnn_12_3_sshleifer_pipeline_en.md new file mode 100644 index 00000000000000..61a2eb20445cc9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbart_cnn_12_3_sshleifer_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English distilbart_cnn_12_3_sshleifer_pipeline pipeline BartTransformer from sshleifer +author: John Snow Labs +name: distilbart_cnn_12_3_sshleifer_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BartTransformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbart_cnn_12_3_sshleifer_pipeline` is a English model originally trained by sshleifer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbart_cnn_12_3_sshleifer_pipeline_en_5.5.1_3.0_1734303663837.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbart_cnn_12_3_sshleifer_pipeline_en_5.5.1_3.0_1734303663837.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbart_cnn_12_3_sshleifer_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbart_cnn_12_3_sshleifer_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbart_cnn_12_3_sshleifer_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|925.6 MB| + +## References + +https://huggingface.co/sshleifer/distilbart-cnn-12-3 + +## Included Models + +- DocumentAssembler +- BartTransformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbart_xsum_12_6_sshleifer_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbart_xsum_12_6_sshleifer_en.md new file mode 100644 index 00000000000000..2ef13a2ce52fd9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbart_xsum_12_6_sshleifer_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English distilbart_xsum_12_6_sshleifer BartTransformer from sshleifer +author: John Snow Labs +name: distilbart_xsum_12_6_sshleifer +date: 2024-12-15 +tags: [en, open_source, onnx, text_generation, bart] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BartTransformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BartTransformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbart_xsum_12_6_sshleifer` is a English model originally trained by sshleifer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbart_xsum_12_6_sshleifer_en_5.5.1_3.0_1734303678373.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbart_xsum_12_6_sshleifer_en_5.5.1_3.0_1734303678373.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+seq2seq = BartTransformer.pretrained("distilbart_xsum_12_6_sshleifer","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("generation")
+
+pipeline = Pipeline().setStages([documentAssembler, seq2seq])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val seq2seq = BartTransformer.pretrained("distilbart_xsum_12_6_sshleifer","en")
+    .setInputCols(Array("document"))
+    .setOutputCol("generation")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, seq2seq))
+val data = Seq("I love spark-nlp").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbart_xsum_12_6_sshleifer| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|977.6 MB| + +## References + +https://huggingface.co/sshleifer/distilbart-xsum-12-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbart_xsum_12_6_sshleifer_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbart_xsum_12_6_sshleifer_pipeline_en.md new file mode 100644 index 00000000000000..f5cc702ea7f563 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbart_xsum_12_6_sshleifer_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English distilbart_xsum_12_6_sshleifer_pipeline pipeline BartTransformer from sshleifer +author: John Snow Labs +name: distilbart_xsum_12_6_sshleifer_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BartTransformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbart_xsum_12_6_sshleifer_pipeline` is a English model originally trained by sshleifer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbart_xsum_12_6_sshleifer_pipeline_en_5.5.1_3.0_1734303947985.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbart_xsum_12_6_sshleifer_pipeline_en_5.5.1_3.0_1734303947985.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbart_xsum_12_6_sshleifer_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbart_xsum_12_6_sshleifer_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbart_xsum_12_6_sshleifer_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|977.6 MB| + +## References + +https://huggingface.co/sshleifer/distilbart-xsum-12-6 + +## Included Models + +- DocumentAssembler +- BartTransformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_cased_finetuned_bible_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_cased_finetuned_bible_en.md new file mode 100644 index 00000000000000..38bde3b5319aaf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_cased_finetuned_bible_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_bible DistilBertEmbeddings from Pragash-Mohanarajah +author: John Snow Labs +name: distilbert_base_cased_finetuned_bible +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_cased_finetuned_bible` is a English model originally trained by Pragash-Mohanarajah. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_bible_en_5.5.1_3.0_1734290255267.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_bible_en_5.5.1_3.0_1734290255267.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_cased_finetuned_bible","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_cased_finetuned_bible","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_bible| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/Pragash-Mohanarajah/distilbert-base-cased-finetuned-bible \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_cased_finetuned_bible_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_cased_finetuned_bible_pipeline_en.md new file mode 100644 index 00000000000000..675ba9a23b1886 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_cased_finetuned_bible_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_bible_pipeline pipeline DistilBertEmbeddings from Pragash-Mohanarajah +author: John Snow Labs +name: distilbert_base_cased_finetuned_bible_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_cased_finetuned_bible_pipeline` is a English model originally trained by Pragash-Mohanarajah. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_bible_pipeline_en_5.5.1_3.0_1734290268629.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_bible_pipeline_en_5.5.1_3.0_1734290268629.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_cased_finetuned_bible_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_cased_finetuned_bible_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_bible_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/Pragash-Mohanarajah/distilbert-base-cased-finetuned-bible + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_lda_train_book_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_lda_train_book_en.md new file mode 100644 index 00000000000000..0dde874ca2be05 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_lda_train_book_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_lda_train_book DistilBertEmbeddings from gokulsrinivasagan +author: John Snow Labs +name: distilbert_base_lda_train_book +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_lda_train_book` is a English model originally trained by gokulsrinivasagan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_lda_train_book_en_5.5.1_3.0_1734289667791.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_lda_train_book_en_5.5.1_3.0_1734289667791.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_lda_train_book","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_lda_train_book","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
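The Python example above leaves the imports and the Spark session implicit, and the resulting `embeddings` column holds Spark NLP annotation structs rather than bare vectors. A minimal sketch of the assumed setup and of flattening the annotations into token/vector pairs (the flattening relies on Spark NLP's standard annotation schema and is illustrative, not part of the original card):

```python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, DistilBertEmbeddings
from pyspark.ml import Pipeline

# Start (or attach to) a Spark session with Spark NLP on the classpath.
spark = sparknlp.start()

# After running the pipeline defined above, explode the annotation array
# to get one row per token with its embedding vector.
vectors = pipelineDF.selectExpr("explode(embeddings) as ann") \
    .selectExpr("ann.result as token", "ann.embeddings as vector")
vectors.show(truncate=False)
```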
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_lda_train_book| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|248.8 MB| + +## References + +https://huggingface.co/gokulsrinivasagan/distilbert_base_lda_train_book \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_lda_train_book_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_lda_train_book_pipeline_en.md new file mode 100644 index 00000000000000..a76127af2c0fd0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_lda_train_book_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_lda_train_book_pipeline pipeline DistilBertEmbeddings from gokulsrinivasagan +author: John Snow Labs +name: distilbert_base_lda_train_book_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_lda_train_book_pipeline` is a English model originally trained by gokulsrinivasagan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_lda_train_book_pipeline_en_5.5.1_3.0_1734289681638.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_lda_train_book_pipeline_en_5.5.1_3.0_1734289681638.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

# Assumes "df" is a Spark DataFrame with a string column named "text"
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("distilbert_base_lda_train_book_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

// Assumes "df" is a Spark DataFrame with a string column named "text"
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("distilbert_base_lda_train_book_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_lda_train_book_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|248.8 MB| + +## References + +https://huggingface.co/gokulsrinivasagan/distilbert_base_lda_train_book + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_train_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_train_en.md new file mode 100644 index 00000000000000..5b4b7df45ce93a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_train_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_train DistilBertEmbeddings from gokulsrinivasagan +author: John Snow Labs +name: distilbert_base_train +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_train` is a English model originally trained by gokulsrinivasagan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_train_en_5.5.1_3.0_1734290136789.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_train_en_5.5.1_3.0_1734290136789.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_train","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_train","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
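The snippet above assumes the standard Spark NLP imports and an already-running Spark session. A brief setup sketch, plus one illustrative way to read the raw vectors back out of the `embeddings` annotations (assumption: the default Spark NLP annotation schema):

```python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, DistilBertEmbeddings
from pyspark.ml import Pipeline

# Start (or attach to) a Spark session with Spark NLP on the classpath.
spark = sparknlp.start()

# Once pipelineDF has been produced as shown above, flatten the annotations
# so that each row carries a token and its embedding vector.
vectors = pipelineDF.selectExpr("explode(embeddings) as ann") \
    .selectExpr("ann.result as token", "ann.embeddings as vector")
vectors.show(truncate=False)
```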
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_train| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|248.9 MB| + +## References + +https://huggingface.co/gokulsrinivasagan/distilbert_base_train \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_train_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_train_pipeline_en.md new file mode 100644 index 00000000000000..4366466c6b6aca --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_train_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_train_pipeline pipeline DistilBertEmbeddings from gokulsrinivasagan +author: John Snow Labs +name: distilbert_base_train_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_train_pipeline` is a English model originally trained by gokulsrinivasagan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_train_pipeline_en_5.5.1_3.0_1734290150934.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_train_pipeline_en_5.5.1_3.0_1734290150934.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

# Assumes "df" is a Spark DataFrame with a string column named "text"
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("distilbert_base_train_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

// Assumes "df" is a Spark DataFrame with a string column named "text"
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("distilbert_base_train_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_train_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|248.9 MB| + +## References + +https://huggingface.co/gokulsrinivasagan/distilbert_base_train + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_czech_bigred_local_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_czech_bigred_local_en.md new file mode 100644 index 00000000000000..900cf5e2c05bb4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_czech_bigred_local_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_czech_bigred_local DistilBertEmbeddings from fahadal +author: John Snow Labs +name: distilbert_base_uncased_finetuned_czech_bigred_local +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_czech_bigred_local` is a English model originally trained by fahadal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_czech_bigred_local_en_5.5.1_3.0_1734289506680.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_czech_bigred_local_en_5.5.1_3.0_1734289506680.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_czech_bigred_local","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_czech_bigred_local","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
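The example above omits imports and session creation. A minimal sketch of that setup, together with an illustrative step (not part of the original card) for turning the `embeddings` annotations into plain token/vector rows:

```python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, DistilBertEmbeddings
from pyspark.ml import Pipeline

# Start (or attach to) a Spark session with Spark NLP on the classpath.
spark = sparknlp.start()

# After running the pipeline defined above, explode the annotation array
# to inspect each token alongside its embedding vector.
vectors = pipelineDF.selectExpr("explode(embeddings) as ann") \
    .selectExpr("ann.result as token", "ann.embeddings as vector")
vectors.show(truncate=False)
```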
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_czech_bigred_local| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/fahadal/distilbert-base-uncased-finetuned-cs-bigred-local \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_czech_bigred_local_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_czech_bigred_local_pipeline_en.md new file mode 100644 index 00000000000000..bd8379cc8a92e8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_czech_bigred_local_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_czech_bigred_local_pipeline pipeline DistilBertEmbeddings from fahadal +author: John Snow Labs +name: distilbert_base_uncased_finetuned_czech_bigred_local_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_czech_bigred_local_pipeline` is a English model originally trained by fahadal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_czech_bigred_local_pipeline_en_5.5.1_3.0_1734289527776.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_czech_bigred_local_pipeline_en_5.5.1_3.0_1734289527776.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

# Assumes "df" is a Spark DataFrame with a string column named "text"
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_czech_bigred_local_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

// Assumes "df" is a Spark DataFrame with a string column named "text"
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_czech_bigred_local_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_czech_bigred_local_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/fahadal/distilbert-base-uncased-finetuned-cs-bigred-local + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_aatta_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_aatta_en.md new file mode 100644 index 00000000000000..40ab3dd2f77e87 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_aatta_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_aatta DistilBertEmbeddings from attardan +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_aatta +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_aatta` is a English model originally trained by attardan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_aatta_en_5.5.1_3.0_1734289340664.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_aatta_en_5.5.1_3.0_1734289340664.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_aatta","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_aatta","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
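As with the other examples, the snippet above assumes the usual Spark NLP imports and an active session. A minimal sketch of that setup and of flattening the resulting annotations (the flattening expressions follow Spark NLP's standard annotation schema and are illustrative only):

```python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, DistilBertEmbeddings
from pyspark.ml import Pipeline

# Start (or attach to) a Spark session with Spark NLP on the classpath.
spark = sparknlp.start()

# Explode the annotations produced above to get one row per token,
# each carrying the token text and its embedding vector.
vectors = pipelineDF.selectExpr("explode(embeddings) as ann") \
    .selectExpr("ann.result as token", "ann.embeddings as vector")
vectors.show(truncate=False)
```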
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_aatta| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/attardan/distilbert-base-uncased-finetuned-imdb-AATTA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_aatta_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_aatta_pipeline_en.md new file mode 100644 index 00000000000000..fb4e9ce76ea1fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_aatta_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_aatta_pipeline pipeline DistilBertEmbeddings from attardan +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_aatta_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_aatta_pipeline` is a English model originally trained by attardan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_aatta_pipeline_en_5.5.1_3.0_1734289354039.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_aatta_pipeline_en_5.5.1_3.0_1734289354039.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

# Assumes "df" is a Spark DataFrame with a string column named "text"
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_aatta_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

// Assumes "df" is a Spark DataFrame with a string column named "text"
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_aatta_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_aatta_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/attardan/distilbert-base-uncased-finetuned-imdb-AATTA + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_achakr37_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_achakr37_en.md new file mode 100644 index 00000000000000..cd71ffc0e9743c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_achakr37_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_achakr37 DistilBertEmbeddings from achakr37 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_achakr37 +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_achakr37` is a English model originally trained by achakr37. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_achakr37_en_5.5.1_3.0_1734289369590.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_achakr37_en_5.5.1_3.0_1734289369590.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_achakr37","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_achakr37","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
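The Python example above leaves the imports and session creation implicit. A short sketch of the assumed setup, plus an illustrative query for pulling the embedding vectors out of the `embeddings` column (assumption: Spark NLP's default annotation schema):

```python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, DistilBertEmbeddings
from pyspark.ml import Pipeline

# Start (or attach to) a Spark session with Spark NLP on the classpath.
spark = sparknlp.start()

# Flatten the annotations produced by the pipeline shown above.
vectors = pipelineDF.selectExpr("explode(embeddings) as ann") \
    .selectExpr("ann.result as token", "ann.embeddings as vector")
vectors.show(truncate=False)
```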
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_achakr37| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/achakr37/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_achakr37_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_achakr37_pipeline_en.md new file mode 100644 index 00000000000000..b988086e981256 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_achakr37_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_achakr37_pipeline pipeline DistilBertEmbeddings from achakr37 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_achakr37_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_achakr37_pipeline` is a English model originally trained by achakr37. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_achakr37_pipeline_en_5.5.1_3.0_1734289385298.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_achakr37_pipeline_en_5.5.1_3.0_1734289385298.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

# Assumes "df" is a Spark DataFrame with a string column named "text"
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_achakr37_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

// Assumes "df" is a Spark DataFrame with a string column named "text"
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_achakr37_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_achakr37_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/achakr37/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_blitherboom_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_blitherboom_en.md new file mode 100644 index 00000000000000..d8df1f5d01f95a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_blitherboom_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_blitherboom DistilBertEmbeddings from BlitherBoom +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_blitherboom +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_blitherboom` is a English model originally trained by BlitherBoom. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_blitherboom_en_5.5.1_3.0_1734289638520.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_blitherboom_en_5.5.1_3.0_1734289638520.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_blitherboom","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_blitherboom","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
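The example above assumes the standard Spark NLP imports and an existing Spark session. A minimal setup sketch, with an illustrative (not card-provided) step for reading token vectors out of the annotation structs:

```python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, DistilBertEmbeddings
from pyspark.ml import Pipeline

# Start (or attach to) a Spark session with Spark NLP on the classpath.
spark = sparknlp.start()

# After running the pipeline defined above, explode the annotation array
# to view each token together with its embedding vector.
vectors = pipelineDF.selectExpr("explode(embeddings) as ann") \
    .selectExpr("ann.result as token", "ann.embeddings as vector")
vectors.show(truncate=False)
```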
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_blitherboom| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/BlitherBoom/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_blitherboom_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_blitherboom_pipeline_en.md new file mode 100644 index 00000000000000..b4a4b130ec38f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_blitherboom_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_blitherboom_pipeline pipeline DistilBertEmbeddings from BlitherBoom +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_blitherboom_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_blitherboom_pipeline` is a English model originally trained by BlitherBoom. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_blitherboom_pipeline_en_5.5.1_3.0_1734289653784.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_blitherboom_pipeline_en_5.5.1_3.0_1734289653784.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

# Assumes "df" is a Spark DataFrame with a string column named "text"
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_blitherboom_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

// Assumes "df" is a Spark DataFrame with a string column named "text"
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_blitherboom_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_blitherboom_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/BlitherBoom/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_c4se_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_c4se_en.md new file mode 100644 index 00000000000000..1bc2a1879cdd8a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_c4se_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_c4se DistilBertEmbeddings from C4SE +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_c4se +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_c4se` is a English model originally trained by C4SE. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_c4se_en_5.5.1_3.0_1734289771736.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_c4se_en_5.5.1_3.0_1734289771736.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_c4se","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_c4se","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
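The snippet above omits imports and session startup. A brief sketch of the assumed setup and of flattening the `embeddings` annotations into plain vectors (illustrative, based on Spark NLP's standard annotation schema):

```python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, DistilBertEmbeddings
from pyspark.ml import Pipeline

# Start (or attach to) a Spark session with Spark NLP on the classpath.
spark = sparknlp.start()

# Explode the annotations produced above to get token/vector pairs.
vectors = pipelineDF.selectExpr("explode(embeddings) as ann") \
    .selectExpr("ann.result as token", "ann.embeddings as vector")
vectors.show(truncate=False)
```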
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_c4se| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/C4SE/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_c4se_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_c4se_pipeline_en.md new file mode 100644 index 00000000000000..a13fbc910e638b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_c4se_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_c4se_pipeline pipeline DistilBertEmbeddings from C4SE +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_c4se_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_c4se_pipeline` is a English model originally trained by C4SE. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_c4se_pipeline_en_5.5.1_3.0_1734289788729.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_c4se_pipeline_en_5.5.1_3.0_1734289788729.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

# Assumes "df" is a Spark DataFrame with a string column named "text"
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_c4se_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

// Assumes "df" is a Spark DataFrame with a string column named "text"
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_c4se_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_c4se_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/C4SE/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_chessmen_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_chessmen_en.md new file mode 100644 index 00000000000000..b61b5f0554fe6a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_chessmen_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_chessmen DistilBertEmbeddings from Chessmen +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_chessmen +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_chessmen` is a English model originally trained by Chessmen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_chessmen_en_5.5.1_3.0_1734290094371.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_chessmen_en_5.5.1_3.0_1734290094371.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_chessmen","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_chessmen","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
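The Python example above assumes the usual Spark NLP imports and an active session. A minimal sketch of that setup, plus an illustrative query for extracting the embedding vectors (assumption: the default annotation schema):

```python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, DistilBertEmbeddings
from pyspark.ml import Pipeline

# Start (or attach to) a Spark session with Spark NLP on the classpath.
spark = sparknlp.start()

# Flatten the annotations produced by the pipeline shown above into
# one row per token with its embedding vector.
vectors = pipelineDF.selectExpr("explode(embeddings) as ann") \
    .selectExpr("ann.result as token", "ann.embeddings as vector")
vectors.show(truncate=False)
```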
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_chessmen| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Chessmen/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_chessmen_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_chessmen_pipeline_en.md new file mode 100644 index 00000000000000..4a850b06ec4f01 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_chessmen_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_chessmen_pipeline pipeline DistilBertEmbeddings from Chessmen +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_chessmen_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_chessmen_pipeline` is a English model originally trained by Chessmen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_chessmen_pipeline_en_5.5.1_3.0_1734290107564.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_chessmen_pipeline_en_5.5.1_3.0_1734290107564.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

# Assumes "df" is a Spark DataFrame with a string column named "text"
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_chessmen_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

// Assumes "df" is a Spark DataFrame with a string column named "text"
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_chessmen_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_chessmen_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Chessmen/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_cxbn12_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_cxbn12_en.md new file mode 100644 index 00000000000000..b5b1cfa416deea --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_cxbn12_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_cxbn12 DistilBertEmbeddings from cxbn12 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_cxbn12 +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_cxbn12` is a English model originally trained by cxbn12. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_cxbn12_en_5.5.1_3.0_1734289997736.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_cxbn12_en_5.5.1_3.0_1734289997736.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_cxbn12","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_cxbn12","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
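The example above leaves the imports and the Spark session implicit. A short setup sketch, followed by an illustrative step (not part of the original card) that flattens the `embeddings` annotations into token/vector rows:

```python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, DistilBertEmbeddings
from pyspark.ml import Pipeline

# Start (or attach to) a Spark session with Spark NLP on the classpath.
spark = sparknlp.start()

# After running the pipeline defined above, explode the annotation array.
vectors = pipelineDF.selectExpr("explode(embeddings) as ann") \
    .selectExpr("ann.result as token", "ann.embeddings as vector")
vectors.show(truncate=False)
```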
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_cxbn12| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/cxbn12/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_cxbn12_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_cxbn12_pipeline_en.md new file mode 100644 index 00000000000000..5d6157897f4203 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_cxbn12_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_cxbn12_pipeline pipeline DistilBertEmbeddings from cxbn12 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_cxbn12_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_cxbn12_pipeline` is a English model originally trained by cxbn12. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_cxbn12_pipeline_en_5.5.1_3.0_1734290011154.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_cxbn12_pipeline_en_5.5.1_3.0_1734290011154.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

# Assumes "df" is a Spark DataFrame with a string column named "text"
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_cxbn12_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

// Assumes "df" is a Spark DataFrame with a string column named "text"
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_cxbn12_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_cxbn12_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/cxbn12/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_delayedkarma_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_delayedkarma_en.md new file mode 100644 index 00000000000000..fe274de5968f20 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_delayedkarma_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_delayedkarma DistilBertEmbeddings from delayedkarma +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_delayedkarma +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_delayedkarma` is a English model originally trained by delayedkarma. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_delayedkarma_en_5.5.1_3.0_1734290035918.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_delayedkarma_en_5.5.1_3.0_1734290035918.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_delayedkarma","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_delayedkarma","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
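The snippet above assumes the standard Spark NLP imports and a running session. A minimal sketch of that setup, plus an illustrative way to read the raw vectors out of the `embeddings` column (assumption: Spark NLP's standard annotation schema):

```python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, DistilBertEmbeddings
from pyspark.ml import Pipeline

# Start (or attach to) a Spark session with Spark NLP on the classpath.
spark = sparknlp.start()

# Explode the annotations produced above to obtain token/vector pairs.
vectors = pipelineDF.selectExpr("explode(embeddings) as ann") \
    .selectExpr("ann.result as token", "ann.embeddings as vector")
vectors.show(truncate=False)
```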
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_delayedkarma| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/delayedkarma/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_delayedkarma_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_delayedkarma_pipeline_en.md new file mode 100644 index 00000000000000..a3442cec3ce0c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_delayedkarma_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_delayedkarma_pipeline pipeline DistilBertEmbeddings from delayedkarma +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_delayedkarma_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_delayedkarma_pipeline` is a English model originally trained by delayedkarma. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_delayedkarma_pipeline_en_5.5.1_3.0_1734290049344.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_delayedkarma_pipeline_en_5.5.1_3.0_1734290049344.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

# Assumes "df" is a Spark DataFrame with a string column named "text"
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_delayedkarma_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

// Assumes "df" is a Spark DataFrame with a string column named "text"
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_delayedkarma_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_delayedkarma_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/delayedkarma/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_dhmo1900_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_dhmo1900_en.md new file mode 100644 index 00000000000000..2ab545535401b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_dhmo1900_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_dhmo1900 DistilBertEmbeddings from Dhmo1900 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_dhmo1900 +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_dhmo1900` is a English model originally trained by Dhmo1900. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_dhmo1900_en_5.5.1_3.0_1734290153627.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_dhmo1900_en_5.5.1_3.0_1734290153627.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_dhmo1900","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_dhmo1900","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
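
Once `pipelineDF` has been computed as above, the token-level vectors live inside the `embeddings` annotation column. The sketch below flattens them with plain PySpark; the 768-dimension remark is an assumption based on the DistilBERT base architecture rather than a value reported by this card.

```python
from pyspark.sql import functions as F

# One row per token annotation: the token text plus its embedding vector.
token_vectors = (
    pipelineDF
    .select(F.explode("embeddings").alias("ann"))
    .select(
        F.col("ann.result").alias("token"),
        F.col("ann.embeddings").alias("vector"),  # float array, typically 768 values for DistilBERT
    )
)

token_vectors.show(truncate=60)
print("vector length:", len(token_vectors.first()["vector"]))
```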
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_dhmo1900| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Dhmo1900/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_dhmo1900_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_dhmo1900_pipeline_en.md new file mode 100644 index 00000000000000..2858a5fe64aa05 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_dhmo1900_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_dhmo1900_pipeline pipeline DistilBertEmbeddings from Dhmo1900 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_dhmo1900_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_dhmo1900_pipeline` is a English model originally trained by Dhmo1900. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_dhmo1900_pipeline_en_5.5.1_3.0_1734290168799.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_dhmo1900_pipeline_en_5.5.1_3.0_1734290168799.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_dhmo1900_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_dhmo1900_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_dhmo1900_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Dhmo1900/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_dqsuper_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_dqsuper_en.md new file mode 100644 index 00000000000000..6137b84d72aede --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_dqsuper_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_dqsuper DistilBertEmbeddings from dqsuper +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_dqsuper +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_dqsuper` is a English model originally trained by dqsuper. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_dqsuper_en_5.5.1_3.0_1734289511023.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_dqsuper_en_5.5.1_3.0_1734289511023.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_dqsuper","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_dqsuper","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_dqsuper| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/dqsuper/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_dqsuper_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_dqsuper_pipeline_en.md new file mode 100644 index 00000000000000..a3bf3a9ab75c91 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_dqsuper_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_dqsuper_pipeline pipeline DistilBertEmbeddings from dqsuper +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_dqsuper_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_dqsuper_pipeline` is a English model originally trained by dqsuper. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_dqsuper_pipeline_en_5.5.1_3.0_1734289526661.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_dqsuper_pipeline_en_5.5.1_3.0_1734289526661.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_dqsuper_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_dqsuper_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_dqsuper_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/dqsuper/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_hayatoshibahara_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_hayatoshibahara_en.md new file mode 100644 index 00000000000000..4c448269a60503 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_hayatoshibahara_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_hayatoshibahara DistilBertEmbeddings from hayatoshibahara +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_hayatoshibahara +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_hayatoshibahara` is a English model originally trained by hayatoshibahara. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_hayatoshibahara_en_5.5.1_3.0_1734289644855.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_hayatoshibahara_en_5.5.1_3.0_1734289644855.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_hayatoshibahara","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_hayatoshibahara","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
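
For quick experiments on single strings it can be more convenient to wrap the fitted `pipelineModel` from the snippet above in a `LightPipeline`, which runs on the driver without building a DataFrame. A sketch, assuming the embedding output column is still named `embeddings`:

```python
from sparknlp.base import LightPipeline

light = LightPipeline(pipelineModel)

# fullAnnotate keeps the embedding vectors; annotate() would return only the string results.
result = light.fullAnnotate("This movie was surprisingly good.")[0]

for ann in result["embeddings"]:
    print(ann.result, len(ann.embeddings))
```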
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_hayatoshibahara| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/hayatoshibahara/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_hayatoshibahara_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_hayatoshibahara_pipeline_en.md new file mode 100644 index 00000000000000..42340c5e4d9e40 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_hayatoshibahara_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_hayatoshibahara_pipeline pipeline DistilBertEmbeddings from hayatoshibahara +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_hayatoshibahara_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_hayatoshibahara_pipeline` is a English model originally trained by hayatoshibahara. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_hayatoshibahara_pipeline_en_5.5.1_3.0_1734289658249.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_hayatoshibahara_pipeline_en_5.5.1_3.0_1734289658249.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_hayatoshibahara_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_hayatoshibahara_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_hayatoshibahara_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/hayatoshibahara/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_jackson107_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_jackson107_en.md new file mode 100644 index 00000000000000..f08d5d101b6383 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_jackson107_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_jackson107 DistilBertEmbeddings from Jackson107 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_jackson107 +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_jackson107` is a English model originally trained by Jackson107. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_jackson107_en_5.5.1_3.0_1734289449697.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_jackson107_en_5.5.1_3.0_1734289449697.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_jackson107","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_jackson107","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
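
The embedding stage also exposes a few optional knobs such as batch size, case handling, and maximum sentence length. The values below are illustrative, not settings recommended by this card; the stage is a drop-in replacement for the `embeddings` stage defined above.

```python
from sparknlp.annotator import DistilBertEmbeddings

# Same pretrained model as above, with optional parameters set explicitly.
embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_jackson107", "en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("embeddings") \
    .setCaseSensitive(False) \
    .setMaxSentenceLength(512) \
    .setBatchSize(8)
```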
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_jackson107| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Jackson107/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_jackson107_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_jackson107_pipeline_en.md new file mode 100644 index 00000000000000..e9d7d2e5c724c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_jackson107_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_jackson107_pipeline pipeline DistilBertEmbeddings from Jackson107 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_jackson107_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_jackson107_pipeline` is a English model originally trained by Jackson107. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_jackson107_pipeline_en_5.5.1_3.0_1734289462972.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_jackson107_pipeline_en_5.5.1_3.0_1734289462972.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_jackson107_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_jackson107_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_jackson107_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Jackson107/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_jhhan_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_jhhan_en.md new file mode 100644 index 00000000000000..0cad4dd60c47ae --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_jhhan_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_jhhan DistilBertEmbeddings from JHhan +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_jhhan +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_jhhan` is a English model originally trained by JHhan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_jhhan_en_5.5.1_3.0_1734289505028.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_jhhan_en_5.5.1_3.0_1734289505028.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_jhhan","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_jhhan","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
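
To feed these vectors into a downstream Spark ML estimator, the annotation structs can be converted into plain Spark ML vectors with `EmbeddingsFinisher`. A sketch, assuming `pipelineDF` from the snippet above; the output column name is chosen for illustration.

```python
from sparknlp.base import EmbeddingsFinisher

finisher = EmbeddingsFinisher() \
    .setInputCols(["embeddings"]) \
    .setOutputCols(["finished_embeddings"]) \
    .setOutputAsVector(True)

finished = finisher.transform(pipelineDF)

# One Spark ML vector per token.
finished.selectExpr("explode(finished_embeddings) as token_vector").show(truncate=60)
```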
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_jhhan| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/JHhan/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_jhhan_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_jhhan_pipeline_en.md new file mode 100644 index 00000000000000..fe32372f83cf39 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_jhhan_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_jhhan_pipeline pipeline DistilBertEmbeddings from JHhan +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_jhhan_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_jhhan_pipeline` is a English model originally trained by JHhan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_jhhan_pipeline_en_5.5.1_3.0_1734289527801.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_jhhan_pipeline_en_5.5.1_3.0_1734289527801.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_jhhan_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_jhhan_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_jhhan_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/JHhan/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_joyle_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_joyle_en.md new file mode 100644 index 00000000000000..7f419122769e1d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_joyle_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_joyle DistilBertEmbeddings from joyle +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_joyle +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_joyle` is a English model originally trained by joyle. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_joyle_en_5.5.1_3.0_1734289245302.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_joyle_en_5.5.1_3.0_1734289245302.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_joyle","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_joyle","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_joyle| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/joyle/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_joyle_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_joyle_pipeline_en.md new file mode 100644 index 00000000000000..12ad020e4ace39 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_joyle_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_joyle_pipeline pipeline DistilBertEmbeddings from joyle +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_joyle_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_joyle_pipeline` is a English model originally trained by joyle. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_joyle_pipeline_en_5.5.1_3.0_1734289261965.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_joyle_pipeline_en_5.5.1_3.0_1734289261965.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_joyle_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_joyle_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_joyle_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/joyle/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_laura0000_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_laura0000_en.md new file mode 100644 index 00000000000000..6ed9aff4dfe35d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_laura0000_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_laura0000 DistilBertEmbeddings from laura0000 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_laura0000 +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_laura0000` is a English model originally trained by laura0000. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_laura0000_en_5.5.1_3.0_1734289783264.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_laura0000_en_5.5.1_3.0_1734289783264.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_laura0000","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_laura0000","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
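
Fitting the pipeline downloads the pretrained weights, so it can be worth persisting the fitted `pipelineModel` and reloading it in later jobs. This sketch uses the standard Spark ML persistence API; the path is a placeholder.

```python
from pyspark.ml import PipelineModel

# Persist the fitted pipeline to a placeholder path.
pipelineModel.write().overwrite().save("/tmp/distilbert_imdb_pipeline")

# Reload and reuse it without downloading the model again.
restored = PipelineModel.load("/tmp/distilbert_imdb_pipeline")
restored.transform(data).select("embeddings").show(truncate=60)
```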
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_laura0000| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/laura0000/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_laura0000_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_laura0000_pipeline_en.md new file mode 100644 index 00000000000000..5441f1294f08e1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_laura0000_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_laura0000_pipeline pipeline DistilBertEmbeddings from laura0000 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_laura0000_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_laura0000_pipeline` is a English model originally trained by laura0000. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_laura0000_pipeline_en_5.5.1_3.0_1734289797264.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_laura0000_pipeline_en_5.5.1_3.0_1734289797264.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_laura0000_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_laura0000_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_laura0000_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/laura0000/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_linqus_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_linqus_en.md new file mode 100644 index 00000000000000..4d3933aebd321a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_linqus_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_linqus DistilBertEmbeddings from linqus +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_linqus +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_linqus` is a English model originally trained by linqus. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_linqus_en_5.5.1_3.0_1734289369090.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_linqus_en_5.5.1_3.0_1734289369090.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_linqus","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_linqus","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_linqus| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/linqus/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_linqus_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_linqus_pipeline_en.md new file mode 100644 index 00000000000000..84c47e413cc78b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_linqus_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_linqus_pipeline pipeline DistilBertEmbeddings from linqus +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_linqus_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_linqus_pipeline` is a English model originally trained by linqus. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_linqus_pipeline_en_5.5.1_3.0_1734289387315.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_linqus_pipeline_en_5.5.1_3.0_1734289387315.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_linqus_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_linqus_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_linqus_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/linqus/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_longma98_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_longma98_en.md new file mode 100644 index 00000000000000..8ed4b9738fd749 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_longma98_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_longma98 DistilBertEmbeddings from longma98 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_longma98 +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_longma98` is a English model originally trained by longma98. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_longma98_en_5.5.1_3.0_1734289388654.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_longma98_en_5.5.1_3.0_1734289388654.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_longma98","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_longma98","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
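
If a single vector per document is needed rather than one per token, a `SentenceEmbeddings` stage can average the token vectors produced above. The pooling strategy and column names below are illustrative assumptions.

```python
from sparknlp.annotator import SentenceEmbeddings

sentenceEmbeddings = SentenceEmbeddings() \
    .setInputCols(["document", "embeddings"]) \
    .setOutputCol("sentence_embeddings") \
    .setPoolingStrategy("AVERAGE")

pooled = sentenceEmbeddings.transform(pipelineDF)

# One averaged vector per document.
pooled.selectExpr("explode(sentence_embeddings.embeddings) as doc_vector").show(truncate=60)
```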
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_longma98| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/longma98/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_longma98_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_longma98_pipeline_en.md new file mode 100644 index 00000000000000..9ed99462f94e83 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_longma98_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_longma98_pipeline pipeline DistilBertEmbeddings from longma98 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_longma98_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_longma98_pipeline` is a English model originally trained by longma98. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_longma98_pipeline_en_5.5.1_3.0_1734289405637.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_longma98_pipeline_en_5.5.1_3.0_1734289405637.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_longma98_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_longma98_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_longma98_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/longma98/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_muhbdeir_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_muhbdeir_en.md new file mode 100644 index 00000000000000..af02dd0442bb44 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_muhbdeir_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_muhbdeir DistilBertEmbeddings from muhbdeir +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_muhbdeir +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_muhbdeir` is a English model originally trained by muhbdeir. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_muhbdeir_en_5.5.1_3.0_1734290016587.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_muhbdeir_en_5.5.1_3.0_1734290016587.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_muhbdeir","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_muhbdeir","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_muhbdeir| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/muhbdeir/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_muhbdeir_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_muhbdeir_pipeline_en.md new file mode 100644 index 00000000000000..784bb5de2d8f55 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_muhbdeir_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_muhbdeir_pipeline pipeline DistilBertEmbeddings from muhbdeir +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_muhbdeir_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_muhbdeir_pipeline` is a English model originally trained by muhbdeir. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_muhbdeir_pipeline_en_5.5.1_3.0_1734290030189.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_muhbdeir_pipeline_en_5.5.1_3.0_1734290030189.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_muhbdeir_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_muhbdeir_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_muhbdeir_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/muhbdeir/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_rjomega_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_rjomega_en.md new file mode 100644 index 00000000000000..a2eb4203494606 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_rjomega_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_rjomega DistilBertEmbeddings from rjomega +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_rjomega +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_rjomega` is a English model originally trained by rjomega. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_rjomega_en_5.5.1_3.0_1734289636704.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_rjomega_en_5.5.1_3.0_1734289636704.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_rjomega","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_rjomega","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
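The snippet above assumes the usual Spark NLP imports and an active Spark session. A minimal setup sketch (module paths follow the standard Spark NLP Python distribution):

```python
# Start a Spark session with Spark NLP and import the pieces used in the example.
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, DistilBertEmbeddings
from pyspark.ml import Pipeline

spark = sparknlp.start()  # returns a SparkSession with Spark NLP on the classpath
```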
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_rjomega| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/rjomega/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_rjomega_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_rjomega_pipeline_en.md new file mode 100644 index 00000000000000..c6552d4d2c798c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_rjomega_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_rjomega_pipeline pipeline DistilBertEmbeddings from rjomega +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_rjomega_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_rjomega_pipeline` is a English model originally trained by rjomega. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_rjomega_pipeline_en_5.5.1_3.0_1734289653781.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_rjomega_pipeline_en_5.5.1_3.0_1734289653781.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_rjomega_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_rjomega_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
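The pipeline call above expects an existing DataFrame `df` with a `text` column. A small test input can be built like this (assuming a session started through `sparknlp.start()`):

```python
import sparknlp

spark = sparknlp.start()
# One-column DataFrame named "text", matching what the pretrained pipeline reads.
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
```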
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_rjomega_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/rjomega/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_rock520_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_rock520_en.md new file mode 100644 index 00000000000000..399f5fe50b075d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_rock520_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_rock520 DistilBertEmbeddings from Rock520 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_rock520 +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_rock520` is a English model originally trained by Rock520. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_rock520_en_5.5.1_3.0_1734290307572.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_rock520_en_5.5.1_3.0_1734290307572.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_rock520","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_rock520","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
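To sanity-check the output, the token-level vectors can be unpacked from the annotation column. This sketch assumes Spark NLP's standard annotation schema, where each entry of `embeddings` carries the token text in `result` and its vector in `embeddings`:

```python
# One row per token: the token text and its DistilBERT vector.
pipelineDF.selectExpr("explode(embeddings) as emb") \
    .selectExpr("emb.result as token", "emb.embeddings as vector") \
    .show(truncate=80)
```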
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_rock520| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Rock520/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_rock520_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_rock520_pipeline_en.md new file mode 100644 index 00000000000000..cf3c485a56337f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_rock520_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_rock520_pipeline pipeline DistilBertEmbeddings from Rock520 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_rock520_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_rock520_pipeline` is a English model originally trained by Rock520. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_rock520_pipeline_en_5.5.1_3.0_1734290320271.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_rock520_pipeline_en_5.5.1_3.0_1734290320271.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_rock520_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_rock520_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_rock520_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Rock520/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_schubertcarvalho_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_schubertcarvalho_en.md new file mode 100644 index 00000000000000..6d2094da1eda4e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_schubertcarvalho_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_schubertcarvalho DistilBertEmbeddings from schubertcarvalho +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_schubertcarvalho +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_schubertcarvalho` is a English model originally trained by schubertcarvalho. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_schubertcarvalho_en_5.5.1_3.0_1734290206147.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_schubertcarvalho_en_5.5.1_3.0_1734290206147.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_schubertcarvalho","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_schubertcarvalho","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
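The fitted pipeline is a regular Spark ML `PipelineModel`, so it can be persisted once and reloaded later instead of re-downloading the embeddings. The path below is only an example:

```python
from pyspark.ml import PipelineModel

# Save the fitted pipeline and load it back for reuse.
pipelineModel.write().overwrite().save("/tmp/distilbert_imdb_embeddings_pipeline")
restored = PipelineModel.load("/tmp/distilbert_imdb_embeddings_pipeline")
restoredDF = restored.transform(data)
```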
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_schubertcarvalho| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/schubertcarvalho/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_schubertcarvalho_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_schubertcarvalho_pipeline_en.md new file mode 100644 index 00000000000000..88d5c6ddcd2a13 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_schubertcarvalho_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_schubertcarvalho_pipeline pipeline DistilBertEmbeddings from schubertcarvalho +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_schubertcarvalho_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_schubertcarvalho_pipeline` is a English model originally trained by schubertcarvalho. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_schubertcarvalho_pipeline_en_5.5.1_3.0_1734290220034.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_schubertcarvalho_pipeline_en_5.5.1_3.0_1734290220034.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_schubertcarvalho_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_schubertcarvalho_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_schubertcarvalho_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/schubertcarvalho/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_spikymelon_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_spikymelon_en.md new file mode 100644 index 00000000000000..f7562d4c87d099 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_spikymelon_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_spikymelon DistilBertEmbeddings from spikymelon +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_spikymelon +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_spikymelon` is a English model originally trained by spikymelon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_spikymelon_en_5.5.1_3.0_1734289244779.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_spikymelon_en_5.5.1_3.0_1734289244779.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_spikymelon","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_spikymelon","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
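For low-latency scoring of individual strings, the fitted model can be wrapped in a `LightPipeline`, which avoids building a DataFrame per request. A sketch, assuming the `pipelineModel` from the example above; the exact shape of the returned annotations may vary by version:

```python
from sparknlp.base import LightPipeline

light = LightPipeline(pipelineModel)
# fullAnnotate returns one dictionary of annotations per input string.
annotations = light.fullAnnotate("I love spark-nlp")[0]
token_vectors = [a.embeddings for a in annotations["embeddings"]]
```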
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_spikymelon| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/spikymelon/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_spikymelon_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_spikymelon_pipeline_en.md new file mode 100644 index 00000000000000..37e985656bc8fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_spikymelon_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_spikymelon_pipeline pipeline DistilBertEmbeddings from spikymelon +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_spikymelon_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_spikymelon_pipeline` is a English model originally trained by spikymelon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_spikymelon_pipeline_en_5.5.1_3.0_1734289260155.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_spikymelon_pipeline_en_5.5.1_3.0_1734289260155.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_spikymelon_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_spikymelon_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_spikymelon_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/spikymelon/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_strawhatdrag0n_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_strawhatdrag0n_en.md new file mode 100644 index 00000000000000..4f972793cb17d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_strawhatdrag0n_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_strawhatdrag0n DistilBertEmbeddings from StrawHatDrag0n +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_strawhatdrag0n +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_strawhatdrag0n` is a English model originally trained by StrawHatDrag0n. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_strawhatdrag0n_en_5.5.1_3.0_1734289763663.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_strawhatdrag0n_en_5.5.1_3.0_1734289763663.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_strawhatdrag0n","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_strawhatdrag0n","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
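When embedding a large corpus, the vectors are usually written out rather than collected to the driver. For example (output path is illustrative, and the nested field access assumes the standard annotation schema):

```python
# Keep the raw text alongside the per-token vectors and persist them as Parquet.
pipelineDF.select("text", "embeddings.embeddings") \
    .write.mode("overwrite").parquet("/tmp/distilbert_imdb_embeddings.parquet")
```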
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_strawhatdrag0n| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/StrawHatDrag0n/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_strawhatdrag0n_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_strawhatdrag0n_pipeline_en.md new file mode 100644 index 00000000000000..50e6a11014f576 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_strawhatdrag0n_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_strawhatdrag0n_pipeline pipeline DistilBertEmbeddings from StrawHatDrag0n +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_strawhatdrag0n_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_strawhatdrag0n_pipeline` is a English model originally trained by StrawHatDrag0n. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_strawhatdrag0n_pipeline_en_5.5.1_3.0_1734289778238.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_strawhatdrag0n_pipeline_en_5.5.1_3.0_1734289778238.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_strawhatdrag0n_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_strawhatdrag0n_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_strawhatdrag0n_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/StrawHatDrag0n/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_sushanthreddy99_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_sushanthreddy99_en.md new file mode 100644 index 00000000000000..04f74435d1fe63 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_sushanthreddy99_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_sushanthreddy99 DistilBertEmbeddings from sushanthreddy99 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_sushanthreddy99 +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_sushanthreddy99` is a English model originally trained by sushanthreddy99. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_sushanthreddy99_en_5.5.1_3.0_1734289396204.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_sushanthreddy99_en_5.5.1_3.0_1734289396204.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_sushanthreddy99","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_sushanthreddy99","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
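If one vector per document is needed rather than one per token, a `SentenceEmbeddings` stage can average the token vectors. A sketch that extends the pipeline defined above (annotator and parameter names should be verified against your Spark NLP version):

```python
from sparknlp.annotator import SentenceEmbeddings

sentenceEmbeddings = SentenceEmbeddings() \
    .setInputCols(["document", "embeddings"]) \
    .setOutputCol("sentence_embeddings") \
    .setPoolingStrategy("AVERAGE")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings, sentenceEmbeddings])
```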
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_sushanthreddy99| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sushanthreddy99/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_sushanthreddy99_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_sushanthreddy99_pipeline_en.md new file mode 100644 index 00000000000000..dfd7e7a9958168 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_sushanthreddy99_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_sushanthreddy99_pipeline pipeline DistilBertEmbeddings from sushanthreddy99 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_sushanthreddy99_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_sushanthreddy99_pipeline` is a English model originally trained by sushanthreddy99. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_sushanthreddy99_pipeline_en_5.5.1_3.0_1734289410283.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_sushanthreddy99_pipeline_en_5.5.1_3.0_1734289410283.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_sushanthreddy99_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_sushanthreddy99_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_sushanthreddy99_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sushanthreddy99/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_valentinguigon_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_valentinguigon_en.md new file mode 100644 index 00000000000000..c4ab7d99cd2f46 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_valentinguigon_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_valentinguigon DistilBertEmbeddings from ValentinGuigon +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_valentinguigon +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_valentinguigon` is a English model originally trained by ValentinGuigon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_valentinguigon_en_5.5.1_3.0_1734290361697.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_valentinguigon_en_5.5.1_3.0_1734290361697.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_valentinguigon","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_valentinguigon","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
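To hand the vectors to downstream Spark ML stages (clustering, classification, and so on), an `EmbeddingsFinisher` can convert the annotations into plain Spark vectors. A sketch, assuming the finisher's usual parameters:

```python
from sparknlp.base import EmbeddingsFinisher

finisher = EmbeddingsFinisher() \
    .setInputCols(["embeddings"]) \
    .setOutputCols(["finished_embeddings"]) \
    .setOutputAsVector(True)

finishedDF = finisher.transform(pipelineDF)
```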
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_valentinguigon| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ValentinGuigon/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_valentinguigon_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_valentinguigon_pipeline_en.md new file mode 100644 index 00000000000000..c3dda16225dda1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_valentinguigon_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_valentinguigon_pipeline pipeline DistilBertEmbeddings from ValentinGuigon +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_valentinguigon_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_valentinguigon_pipeline` is a English model originally trained by ValentinGuigon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_valentinguigon_pipeline_en_5.5.1_3.0_1734290374251.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_valentinguigon_pipeline_en_5.5.1_3.0_1734290374251.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_valentinguigon_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_valentinguigon_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_valentinguigon_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ValentinGuigon/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_vohuutridung_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_vohuutridung_en.md new file mode 100644 index 00000000000000..2f1765db0f99b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_vohuutridung_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_vohuutridung DistilBertEmbeddings from VoHuuTriDung +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_vohuutridung +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_vohuutridung` is a English model originally trained by VoHuuTriDung. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_vohuutridung_en_5.5.1_3.0_1734289900327.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_vohuutridung_en_5.5.1_3.0_1734289900327.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_vohuutridung","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_vohuutridung","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_vohuutridung| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/VoHuuTriDung/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_vohuutridung_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_vohuutridung_pipeline_en.md new file mode 100644 index 00000000000000..c2240545bb5171 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_vohuutridung_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_vohuutridung_pipeline pipeline DistilBertEmbeddings from VoHuuTriDung +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_vohuutridung_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_vohuutridung_pipeline` is a English model originally trained by VoHuuTriDung. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_vohuutridung_pipeline_en_5.5.1_3.0_1734289914161.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_vohuutridung_pipeline_en_5.5.1_3.0_1734289914161.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_vohuutridung_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_vohuutridung_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_vohuutridung_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/VoHuuTriDung/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_yangwhale_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_yangwhale_en.md new file mode 100644 index 00000000000000..75e77dc4bca0e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_yangwhale_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_yangwhale DistilBertEmbeddings from yangwhale +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_yangwhale +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_yangwhale` is a English model originally trained by yangwhale. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_yangwhale_en_5.5.1_3.0_1734289560638.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_yangwhale_en_5.5.1_3.0_1734289560638.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_yangwhale","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_yangwhale","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_yangwhale| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/yangwhale/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_yangwhale_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_yangwhale_pipeline_en.md new file mode 100644 index 00000000000000..1640a5f4dcd9e6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_imdb_yangwhale_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_yangwhale_pipeline pipeline DistilBertEmbeddings from yangwhale +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_yangwhale_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_yangwhale_pipeline` is a English model originally trained by yangwhale. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_yangwhale_pipeline_en_5.5.1_3.0_1734289573753.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_yangwhale_pipeline_en_5.5.1_3.0_1734289573753.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_yangwhale_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_yangwhale_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
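The downloaded pipeline wraps an ordinary `PipelineModel`, so its stages (listed under "Included Models" below) can be inspected or reused individually. A sketch, assuming the pipeline object exposes the underlying model as `model`:

```python
for stage in pipeline.model.stages:
    print(stage)  # DocumentAssembler, TokenizerModel, DistilBertEmbeddings
```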
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_yangwhale_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/yangwhale/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_ratemyprof_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_ratemyprof_en.md new file mode 100644 index 00000000000000..b677eb5372a3fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_ratemyprof_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ratemyprof DistilBertEmbeddings from herald-of-spring +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ratemyprof +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_ratemyprof` is a English model originally trained by herald-of-spring. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ratemyprof_en_5.5.1_3.0_1734289771918.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ratemyprof_en_5.5.1_3.0_1734289771918.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_ratemyprof","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_ratemyprof","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ratemyprof| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/herald-of-spring/distilbert-base-uncased-finetuned-ratemyprof \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_ratemyprof_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_ratemyprof_pipeline_en.md new file mode 100644 index 00000000000000..df9727bc1427b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_ratemyprof_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ratemyprof_pipeline pipeline DistilBertEmbeddings from herald-of-spring +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ratemyprof_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_ratemyprof_pipeline` is a English model originally trained by herald-of-spring. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ratemyprof_pipeline_en_5.5.1_3.0_1734289788622.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ratemyprof_pipeline_en_5.5.1_3.0_1734289788622.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_ratemyprof_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_ratemyprof_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
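
The snippet above assumes an input DataFrame `df` already exists. As a minimal sketch, it can be built the same way as in the embeddings example for this model (an active `spark` session is assumed):

```python
# Sketch only: build the expected single-column "text" DataFrame and run the
# pretrained pipeline on it.
from sparknlp.pretrained import PretrainedPipeline

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_ratemyprof_pipeline", lang="en")
annotations = pipeline.transform(df)
annotations.printSchema()  # inspect the annotation columns the pipeline adds
```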
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ratemyprof_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/herald-of-spring/distilbert-base-uncased-finetuned-ratemyprof + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_synthetic_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_synthetic_en.md new file mode 100644 index 00000000000000..9b2dfda1719411 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_synthetic_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_synthetic DistilBertEmbeddings from Chrisantha +author: John Snow Labs +name: distilbert_base_uncased_finetuned_synthetic +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_synthetic` is a English model originally trained by Chrisantha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_synthetic_en_5.5.1_3.0_1734290285354.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_synthetic_en_5.5.1_3.0_1734290285354.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_synthetic","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_synthetic","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_synthetic| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Chrisantha/distilbert-base-uncased-finetuned-synthetic \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_synthetic_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_synthetic_pipeline_en.md new file mode 100644 index 00000000000000..2ff53abd5f39a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_base_uncased_finetuned_synthetic_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_synthetic_pipeline pipeline DistilBertEmbeddings from Chrisantha +author: John Snow Labs +name: distilbert_base_uncased_finetuned_synthetic_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_synthetic_pipeline` is a English model originally trained by Chrisantha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_synthetic_pipeline_en_5.5.1_3.0_1734290299690.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_synthetic_pipeline_en_5.5.1_3.0_1734290299690.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_synthetic_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_synthetic_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_synthetic_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Chrisantha/distilbert-base-uncased-finetuned-synthetic + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_sql_imdb_0_4_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_sql_imdb_0_4_en.md new file mode 100644 index 00000000000000..6bddcf1bb81d3b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_sql_imdb_0_4_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_sql_imdb_0_4 DistilBertEmbeddings from h40vv3n +author: John Snow Labs +name: distilbert_sql_imdb_0_4 +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_sql_imdb_0_4` is a English model originally trained by h40vv3n. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sql_imdb_0_4_en_5.5.1_3.0_1734289973197.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sql_imdb_0_4_en_5.5.1_3.0_1734289973197.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_sql_imdb_0_4","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_sql_imdb_0_4","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sql_imdb_0_4| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|262.2 MB| + +## References + +https://huggingface.co/h40vv3n/distilbert-sql-imdb-0.4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_sql_imdb_0_4_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_sql_imdb_0_4_pipeline_en.md new file mode 100644 index 00000000000000..9862b08aee8e3e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_sql_imdb_0_4_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_sql_imdb_0_4_pipeline pipeline DistilBertEmbeddings from h40vv3n +author: John Snow Labs +name: distilbert_sql_imdb_0_4_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_sql_imdb_0_4_pipeline` is a English model originally trained by h40vv3n. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sql_imdb_0_4_pipeline_en_5.5.1_3.0_1734289988537.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sql_imdb_0_4_pipeline_en_5.5.1_3.0_1734289988537.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_sql_imdb_0_4_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_sql_imdb_0_4_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sql_imdb_0_4_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|262.2 MB| + +## References + +https://huggingface.co/h40vv3n/distilbert-sql-imdb-0.4 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_sql_imdb_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_sql_imdb_en.md new file mode 100644 index 00000000000000..497027849cf48f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_sql_imdb_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_sql_imdb DistilBertEmbeddings from h40vv3n +author: John Snow Labs +name: distilbert_sql_imdb +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_sql_imdb` is a English model originally trained by h40vv3n. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sql_imdb_en_5.5.1_3.0_1734289805700.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sql_imdb_en_5.5.1_3.0_1734289805700.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_sql_imdb","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_sql_imdb","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sql_imdb| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|262.2 MB| + +## References + +https://huggingface.co/h40vv3n/distilbert-sql-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_sql_imdb_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_sql_imdb_pipeline_en.md new file mode 100644 index 00000000000000..17428d3cfad749 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_sql_imdb_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_sql_imdb_pipeline pipeline DistilBertEmbeddings from h40vv3n +author: John Snow Labs +name: distilbert_sql_imdb_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_sql_imdb_pipeline` is a English model originally trained by h40vv3n. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sql_imdb_pipeline_en_5.5.1_3.0_1734289820063.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sql_imdb_pipeline_en_5.5.1_3.0_1734289820063.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_sql_imdb_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_sql_imdb_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sql_imdb_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|262.2 MB| + +## References + +https://huggingface.co/h40vv3n/distilbert-sql-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_trained_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_trained_en.md new file mode 100644 index 00000000000000..f31f189473ab13 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_trained_en.md @@ -0,0 +1,96 @@ +--- +layout: model +title: English distilbert_trained DistilBertForTokenClassification from AlfredBink +author: John Snow Labs +name: distilbert_trained +date: 2024-12-15 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_trained` is a English model originally trained by AlfredBink. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_trained_en_5.5.1_3.0_1734289895271.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_trained_en_5.5.1_3.0_1734289895271.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_trained","en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)
```
```scala
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols("document")
  .setOutputCol("token")

val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_trained", "en")
  .setInputCols(Array("document", "token"))
  .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)
```
</div>
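
To read the predicted tags back out of `pipelineDF` from the snippet above — a sketch only; `result`, `begin`, and `end` are standard annotation fields:

```python
# Sketch only: one row per token-level prediction in the "ner" output column.
from pyspark.sql import functions as F

pipelineDF.select(F.explode("ner").alias("ann")).select(
    F.col("ann.result").alias("tag"),
    F.col("ann.begin"),
    F.col("ann.end")
).show(truncate=False)
```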
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_trained| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +References + +https://huggingface.co/AlfredBink/distilbert-trained \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilbert_trained_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilbert_trained_pipeline_en.md new file mode 100644 index 00000000000000..97d7408e6c5e10 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilbert_trained_pipeline_en.md @@ -0,0 +1,72 @@ +--- +layout: model +title: English distilbert_trained_pipeline pipeline DistilBertForTokenClassification from AlfredBink +author: John Snow Labs +name: distilbert_trained_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_trained_pipeline` is a English model originally trained by AlfredBink. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_trained_pipeline_en_5.5.1_3.0_1734289910122.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_trained_pipeline_en_5.5.1_3.0_1734289910122.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +pipeline = PretrainedPipeline("distilbert_trained_pipeline", lang = "en") +annotations = pipeline.transform(df) +``` +```scala +val pipeline = new PretrainedPipeline("distilbert_trained_pipeline", lang = "en") +val annotations = pipeline.transform(df) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_trained_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +References + +https://huggingface.co/AlfredBink/distilbert-trained + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distillbert_finetune_imdbs_mlm_en.md b/docs/_posts/ahmedlone127/2024-12-15-distillbert_finetune_imdbs_mlm_en.md new file mode 100644 index 00000000000000..9ce7dbe75a2166 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distillbert_finetune_imdbs_mlm_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distillbert_finetune_imdbs_mlm DistilBertEmbeddings from VuHuy +author: John Snow Labs +name: distillbert_finetune_imdbs_mlm +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distillbert_finetune_imdbs_mlm` is a English model originally trained by VuHuy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distillbert_finetune_imdbs_mlm_en_5.5.1_3.0_1734290129667.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distillbert_finetune_imdbs_mlm_en_5.5.1_3.0_1734290129667.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distillbert_finetune_imdbs_mlm","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distillbert_finetune_imdbs_mlm","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distillbert_finetune_imdbs_mlm| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/VuHuy/distillbert-finetune-imdbs-mlm \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distillbert_finetune_imdbs_mlm_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distillbert_finetune_imdbs_mlm_pipeline_en.md new file mode 100644 index 00000000000000..289ff1013fc19a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distillbert_finetune_imdbs_mlm_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distillbert_finetune_imdbs_mlm_pipeline pipeline DistilBertEmbeddings from VuHuy +author: John Snow Labs +name: distillbert_finetune_imdbs_mlm_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distillbert_finetune_imdbs_mlm_pipeline` is a English model originally trained by VuHuy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distillbert_finetune_imdbs_mlm_pipeline_en_5.5.1_3.0_1734290143166.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distillbert_finetune_imdbs_mlm_pipeline_en_5.5.1_3.0_1734290143166.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distillbert_finetune_imdbs_mlm_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distillbert_finetune_imdbs_mlm_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distillbert_finetune_imdbs_mlm_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/VuHuy/distillbert-finetune-imdbs-mlm + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilroberta_base_mrpc_glue_eugenioroma_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilroberta_base_mrpc_glue_eugenioroma_en.md new file mode 100644 index 00000000000000..f270cd0a3311b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilroberta_base_mrpc_glue_eugenioroma_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilroberta_base_mrpc_glue_eugenioroma RoBertaForSequenceClassification from EugenioRoma +author: John Snow Labs +name: distilroberta_base_mrpc_glue_eugenioroma +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilroberta_base_mrpc_glue_eugenioroma` is a English model originally trained by EugenioRoma. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_base_mrpc_glue_eugenioroma_en_5.5.1_3.0_1734287530119.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_base_mrpc_glue_eugenioroma_en_5.5.1_3.0_1734287530119.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_mrpc_glue_eugenioroma","en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_mrpc_glue_eugenioroma", "en")
  .setInputCols(Array("document", "token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_base_mrpc_glue_eugenioroma| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/EugenioRoma/distilroberta-base-mrpc-glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-distilroberta_base_mrpc_glue_eugenioroma_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-distilroberta_base_mrpc_glue_eugenioroma_pipeline_en.md new file mode 100644 index 00000000000000..de3f92c97c5793 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-distilroberta_base_mrpc_glue_eugenioroma_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilroberta_base_mrpc_glue_eugenioroma_pipeline pipeline RoBertaForSequenceClassification from EugenioRoma +author: John Snow Labs +name: distilroberta_base_mrpc_glue_eugenioroma_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilroberta_base_mrpc_glue_eugenioroma_pipeline` is a English model originally trained by EugenioRoma. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_base_mrpc_glue_eugenioroma_pipeline_en_5.5.1_3.0_1734287546485.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_base_mrpc_glue_eugenioroma_pipeline_en_5.5.1_3.0_1734287546485.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilroberta_base_mrpc_glue_eugenioroma_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilroberta_base_mrpc_glue_eugenioroma_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_base_mrpc_glue_eugenioroma_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/EugenioRoma/distilroberta-base-mrpc-glue + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-emotion_classifier_surya0105_en.md b/docs/_posts/ahmedlone127/2024-12-15-emotion_classifier_surya0105_en.md new file mode 100644 index 00000000000000..cd1be8f3e193c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-emotion_classifier_surya0105_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English emotion_classifier_surya0105 RoBertaForSequenceClassification from Surya0105 +author: John Snow Labs +name: emotion_classifier_surya0105 +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`emotion_classifier_surya0105` is a English model originally trained by Surya0105. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emotion_classifier_surya0105_en_5.5.1_3.0_1734287214885.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emotion_classifier_surya0105_en_5.5.1_3.0_1734287214885.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = RoBertaForSequenceClassification.pretrained("emotion_classifier_surya0105","en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = RoBertaForSequenceClassification.pretrained("emotion_classifier_surya0105", "en")
  .setInputCols(Array("document", "token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
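
To read the predicted label from `pipelineDF` in the snippet above — a sketch only; the label is stored in the annotation's `result` field:

```python
# Sketch only: the "class" column holds one annotation per row whose `result`
# is the predicted label.
pipelineDF.select("text", "class.result").show(truncate=False)
```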
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|emotion_classifier_surya0105| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|453.7 MB| + +## References + +https://huggingface.co/Surya0105/Emotion-Classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-emotion_classifier_surya0105_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-emotion_classifier_surya0105_pipeline_en.md new file mode 100644 index 00000000000000..c1e39bd465b32d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-emotion_classifier_surya0105_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English emotion_classifier_surya0105_pipeline pipeline RoBertaForSequenceClassification from Surya0105 +author: John Snow Labs +name: emotion_classifier_surya0105_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`emotion_classifier_surya0105_pipeline` is a English model originally trained by Surya0105. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emotion_classifier_surya0105_pipeline_en_5.5.1_3.0_1734287242292.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emotion_classifier_surya0105_pipeline_en_5.5.1_3.0_1734287242292.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("emotion_classifier_surya0105_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("emotion_classifier_surya0105_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|emotion_classifier_surya0105_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|453.7 MB| + +## References + +https://huggingface.co/Surya0105/Emotion-Classifier + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-englishtoshakespearean_en.md b/docs/_posts/ahmedlone127/2024-12-15-englishtoshakespearean_en.md new file mode 100644 index 00000000000000..5cb52b784abebe --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-englishtoshakespearean_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English englishtoshakespearean T5Transformer from AerlenTheStout +author: John Snow Labs +name: englishtoshakespearean +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`englishtoshakespearean` is a English model originally trained by AerlenTheStout. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/englishtoshakespearean_en_5.5.1_3.0_1734298893251.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/englishtoshakespearean_en_5.5.1_3.0_1734298893251.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("englishtoshakespearean","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val t5 = T5Transformer.pretrained("englishtoshakespearean", "en")
  .setInputCols(Array("document"))
  .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|englishtoshakespearean| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|349.8 MB| + +## References + +https://huggingface.co/AerlenTheStout/EnglishToShakespearean \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-englishtoshakespearean_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-englishtoshakespearean_pipeline_en.md new file mode 100644 index 00000000000000..6aecf8939d74af --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-englishtoshakespearean_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English englishtoshakespearean_pipeline pipeline T5Transformer from AerlenTheStout +author: John Snow Labs +name: englishtoshakespearean_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`englishtoshakespearean_pipeline` is a English model originally trained by AerlenTheStout. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/englishtoshakespearean_pipeline_en_5.5.1_3.0_1734298910967.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/englishtoshakespearean_pipeline_en_5.5.1_3.0_1734298910967.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("englishtoshakespearean_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("englishtoshakespearean_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|englishtoshakespearean_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|349.8 MB| + +## References + +https://huggingface.co/AerlenTheStout/EnglishToShakespearean + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-et5_spelling_correction_en.md b/docs/_posts/ahmedlone127/2024-12-15-et5_spelling_correction_en.md new file mode 100644 index 00000000000000..70f5a45d4cc1c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-et5_spelling_correction_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English et5_spelling_correction T5Transformer from yurim111 +author: John Snow Labs +name: et5_spelling_correction +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`et5_spelling_correction` is a English model originally trained by yurim111. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/et5_spelling_correction_en_5.5.1_3.0_1734302589788.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/et5_spelling_correction_en_5.5.1_3.0_1734302589788.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("et5_spelling_correction","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val t5 = T5Transformer.pretrained("et5_spelling_correction", "en")
  .setInputCols(Array("document"))
  .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
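
To get the generated (corrected) text back out of `pipelineDF` from the snippet above — a sketch only; the generated string sits in the `result` field of the `output` column:

```python
# Sketch only: collect the T5 generations alongside the original input text.
pipelineDF.select("text", "output.result").show(truncate=False)
```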
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|et5_spelling_correction| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/yurim111/et5-spelling-correction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-et5_spelling_correction_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-et5_spelling_correction_pipeline_en.md new file mode 100644 index 00000000000000..0924bbff521acd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-et5_spelling_correction_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English et5_spelling_correction_pipeline pipeline T5Transformer from yurim111 +author: John Snow Labs +name: et5_spelling_correction_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`et5_spelling_correction_pipeline` is a English model originally trained by yurim111. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/et5_spelling_correction_pipeline_en_5.5.1_3.0_1734302656373.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/et5_spelling_correction_pipeline_en_5.5.1_3.0_1734302656373.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("et5_spelling_correction_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("et5_spelling_correction_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|et5_spelling_correction_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/yurim111/et5-spelling-correction + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-final_32shots_twitter_skhead_train_5epoch_en.md b/docs/_posts/ahmedlone127/2024-12-15-final_32shots_twitter_skhead_train_5epoch_en.md new file mode 100644 index 00000000000000..a5ec6006d8a8c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-final_32shots_twitter_skhead_train_5epoch_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English final_32shots_twitter_skhead_train_5epoch MPNetEmbeddings from Nhat1904 +author: John Snow Labs +name: final_32shots_twitter_skhead_train_5epoch +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`final_32shots_twitter_skhead_train_5epoch` is a English model originally trained by Nhat1904. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/final_32shots_twitter_skhead_train_5epoch_en_5.5.1_3.0_1734306876375.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/final_32shots_twitter_skhead_train_5epoch_en_5.5.1_3.0_1734306876375.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("final_32shots_twitter_skhead_train_5epoch","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("final_32shots_twitter_skhead_train_5epoch","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|final_32shots_twitter_skhead_train_5epoch| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/Nhat1904/Final-32shots-Twitter-Skhead-Train-5epoch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-final_32shots_twitter_skhead_train_5epoch_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-final_32shots_twitter_skhead_train_5epoch_pipeline_en.md new file mode 100644 index 00000000000000..c8939d577c2c31 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-final_32shots_twitter_skhead_train_5epoch_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English final_32shots_twitter_skhead_train_5epoch_pipeline pipeline MPNetEmbeddings from Nhat1904 +author: John Snow Labs +name: final_32shots_twitter_skhead_train_5epoch_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`final_32shots_twitter_skhead_train_5epoch_pipeline` is a English model originally trained by Nhat1904. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/final_32shots_twitter_skhead_train_5epoch_pipeline_en_5.5.1_3.0_1734306897371.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/final_32shots_twitter_skhead_train_5epoch_pipeline_en_5.5.1_3.0_1734306897371.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("final_32shots_twitter_skhead_train_5epoch_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("final_32shots_twitter_skhead_train_5epoch_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|final_32shots_twitter_skhead_train_5epoch_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/Nhat1904/Final-32shots-Twitter-Skhead-Train-5epoch + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-final_32shots_twitter_skhead_train_en.md b/docs/_posts/ahmedlone127/2024-12-15-final_32shots_twitter_skhead_train_en.md new file mode 100644 index 00000000000000..bc20a4cc761a1b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-final_32shots_twitter_skhead_train_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English final_32shots_twitter_skhead_train MPNetEmbeddings from Nhat1904 +author: John Snow Labs +name: final_32shots_twitter_skhead_train +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`final_32shots_twitter_skhead_train` is a English model originally trained by Nhat1904. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/final_32shots_twitter_skhead_train_en_5.5.1_3.0_1734307000733.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/final_32shots_twitter_skhead_train_en_5.5.1_3.0_1734307000733.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("final_32shots_twitter_skhead_train","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("final_32shots_twitter_skhead_train","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|final_32shots_twitter_skhead_train| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/Nhat1904/Final-32shots-Twitter-Skhead-Train \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-final_32shots_twitter_skhead_train_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-final_32shots_twitter_skhead_train_pipeline_en.md new file mode 100644 index 00000000000000..7f32d09beba512 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-final_32shots_twitter_skhead_train_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English final_32shots_twitter_skhead_train_pipeline pipeline MPNetEmbeddings from Nhat1904 +author: John Snow Labs +name: final_32shots_twitter_skhead_train_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`final_32shots_twitter_skhead_train_pipeline` is a English model originally trained by Nhat1904. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/final_32shots_twitter_skhead_train_pipeline_en_5.5.1_3.0_1734307021366.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/final_32shots_twitter_skhead_train_pipeline_en_5.5.1_3.0_1734307021366.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("final_32shots_twitter_skhead_train_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("final_32shots_twitter_skhead_train_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|final_32shots_twitter_skhead_train_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/Nhat1904/Final-32shots-Twitter-Skhead-Train + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_en.md b/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_en.md new file mode 100644 index 00000000000000..a8e2e7bbdb4c50 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English fine_tuned T5Transformer from supkon +author: John Snow Labs +name: fine_tuned +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`fine_tuned` is a English model originally trained by supkon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_en_5.5.1_3.0_1734301941854.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_en_5.5.1_3.0_1734301941854.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("fine_tuned","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("fine_tuned", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|519.8 MB| + +## References + +https://huggingface.co/supkon/fine-tuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_finroberta_en.md b/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_finroberta_en.md new file mode 100644 index 00000000000000..0c99c5b70215a1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_finroberta_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English fine_tuned_finroberta RoBertaForSequenceClassification from kekunh +author: John Snow Labs +name: fine_tuned_finroberta +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`fine_tuned_finroberta` is a English model originally trained by kekunh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_finroberta_en_5.5.1_3.0_1734287848366.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_finroberta_en_5.5.1_3.0_1734287848366.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = RoBertaForSequenceClassification.pretrained("fine_tuned_finroberta","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = RoBertaForSequenceClassification.pretrained("fine_tuned_finroberta", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
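
Assuming the pipeline above runs as written, each prediction lands in the `class` annotation column; one way to surface the labels (a sketch, not part of the original card, using the standard Spark NLP annotation schema where `metadata` typically carries per-label scores):

```python
# "result" holds the predicted label string for each row
pipelineDF.select("text", "class.result").show(truncate=False)

# Explode the annotations to also see the metadata (typically per-label scores)
pipelineDF.selectExpr("explode(class) as prediction") \
    .select("prediction.result", "prediction.metadata") \
    .show(truncate=False)
```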
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_finroberta| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.8 MB| + +## References + +https://huggingface.co/kekunh/fine_tuned_finroberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_finroberta_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_finroberta_pipeline_en.md new file mode 100644 index 00000000000000..73dce3de0f51ae --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_finroberta_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English fine_tuned_finroberta_pipeline pipeline RoBertaForSequenceClassification from kekunh +author: John Snow Labs +name: fine_tuned_finroberta_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`fine_tuned_finroberta_pipeline` is a English model originally trained by kekunh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_finroberta_pipeline_en_5.5.1_3.0_1734287870375.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_finroberta_pipeline_en_5.5.1_3.0_1734287870375.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("fine_tuned_finroberta_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("fine_tuned_finroberta_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_finroberta_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|308.8 MB| + +## References + +https://huggingface.co/kekunh/fine_tuned_finroberta + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_pipeline_en.md new file mode 100644 index 00000000000000..b290468c742859 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English fine_tuned_pipeline pipeline T5Transformer from supkon +author: John Snow Labs +name: fine_tuned_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`fine_tuned_pipeline` is a English model originally trained by supkon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_pipeline_en_5.5.1_3.0_1734302110686.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_pipeline_en_5.5.1_3.0_1734302110686.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("fine_tuned_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("fine_tuned_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|519.8 MB| + +## References + +https://huggingface.co/supkon/fine-tuned + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_rte_xlmroberta_lenatr99_en.md b/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_rte_xlmroberta_lenatr99_en.md new file mode 100644 index 00000000000000..31f64779063a3f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_rte_xlmroberta_lenatr99_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English fine_tuned_rte_xlmroberta_lenatr99 XlmRoBertaForSequenceClassification from lenatr99 +author: John Snow Labs +name: fine_tuned_rte_xlmroberta_lenatr99 +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`fine_tuned_rte_xlmroberta_lenatr99` is a English model originally trained by lenatr99. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_rte_xlmroberta_lenatr99_en_5.5.1_3.0_1734293906491.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_rte_xlmroberta_lenatr99_en_5.5.1_3.0_1734293906491.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("fine_tuned_rte_xlmroberta_lenatr99","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("fine_tuned_rte_xlmroberta_lenatr99", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_rte_xlmroberta_lenatr99| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|799.4 MB| + +## References + +https://huggingface.co/lenatr99/fine_tuned_rte_XLMroberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_rte_xlmroberta_lenatr99_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_rte_xlmroberta_lenatr99_pipeline_en.md new file mode 100644 index 00000000000000..a69e6240dd1a4d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_rte_xlmroberta_lenatr99_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English fine_tuned_rte_xlmroberta_lenatr99_pipeline pipeline XlmRoBertaForSequenceClassification from lenatr99 +author: John Snow Labs +name: fine_tuned_rte_xlmroberta_lenatr99_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`fine_tuned_rte_xlmroberta_lenatr99_pipeline` is a English model originally trained by lenatr99. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_rte_xlmroberta_lenatr99_pipeline_en_5.5.1_3.0_1734294032368.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_rte_xlmroberta_lenatr99_pipeline_en_5.5.1_3.0_1734294032368.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("fine_tuned_rte_xlmroberta_lenatr99_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("fine_tuned_rte_xlmroberta_lenatr99_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_rte_xlmroberta_lenatr99_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|799.4 MB| + +## References + +https://huggingface.co/lenatr99/fine_tuned_rte_XLMroberta + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_t5_small_model_sec_5_v9_en.md b/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_t5_small_model_sec_5_v9_en.md new file mode 100644 index 00000000000000..698e8a94c36578 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_t5_small_model_sec_5_v9_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English fine_tuned_t5_small_model_sec_5_v9 T5Transformer from miasetya +author: John Snow Labs +name: fine_tuned_t5_small_model_sec_5_v9 +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`fine_tuned_t5_small_model_sec_5_v9` is a English model originally trained by miasetya. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_t5_small_model_sec_5_v9_en_5.5.1_3.0_1734300981299.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_t5_small_model_sec_5_v9_en_5.5.1_3.0_1734300981299.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("fine_tuned_t5_small_model_sec_5_v9","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("fine_tuned_t5_small_model_sec_5_v9", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
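
Assuming the example above completes, the text generated by the T5 model is available in the `output` annotation column; a minimal sketch for collecting it:

```python
# Each annotation's "result" field contains the generated string
pipelineDF.select("output.result").show(truncate=False)

# Or collect the generated strings on the driver (one list of strings per input row)
generated = [row.result for row in pipelineDF.select("output.result").collect()]
```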
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_t5_small_model_sec_5_v9| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|319.9 MB| + +## References + +https://huggingface.co/miasetya/fine_tuned_t5_small_model_sec_5_v9 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_t5_small_model_sec_5_v9_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_t5_small_model_sec_5_v9_pipeline_en.md new file mode 100644 index 00000000000000..75259d9c305e4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_t5_small_model_sec_5_v9_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English fine_tuned_t5_small_model_sec_5_v9_pipeline pipeline T5Transformer from miasetya +author: John Snow Labs +name: fine_tuned_t5_small_model_sec_5_v9_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`fine_tuned_t5_small_model_sec_5_v9_pipeline` is a English model originally trained by miasetya. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_t5_small_model_sec_5_v9_pipeline_en_5.5.1_3.0_1734301003703.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_t5_small_model_sec_5_v9_pipeline_en_5.5.1_3.0_1734301003703.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("fine_tuned_t5_small_model_sec_5_v9_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("fine_tuned_t5_small_model_sec_5_v9_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_t5_small_model_sec_5_v9_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|319.9 MB| + +## References + +https://huggingface.co/miasetya/fine_tuned_t5_small_model_sec_5_v9 + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_t5_squad_en.md b/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_t5_squad_en.md new file mode 100644 index 00000000000000..c65009c541540c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_t5_squad_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English fine_tuned_t5_squad T5Transformer from Drashtip +author: John Snow Labs +name: fine_tuned_t5_squad +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`fine_tuned_t5_squad` is a English model originally trained by Drashtip. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_t5_squad_en_5.5.1_3.0_1734300885032.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_t5_squad_en_5.5.1_3.0_1734300885032.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("fine_tuned_t5_squad","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("fine_tuned_t5_squad", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_t5_squad| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|330.2 MB| + +## References + +https://huggingface.co/Drashtip/fine_tuned_t5_squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_t5_squad_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_t5_squad_pipeline_en.md new file mode 100644 index 00000000000000..2a26f9707a94c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-fine_tuned_t5_squad_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English fine_tuned_t5_squad_pipeline pipeline T5Transformer from Drashtip +author: John Snow Labs +name: fine_tuned_t5_squad_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`fine_tuned_t5_squad_pipeline` is a English model originally trained by Drashtip. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_t5_squad_pipeline_en_5.5.1_3.0_1734300906768.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_t5_squad_pipeline_en_5.5.1_3.0_1734300906768.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("fine_tuned_t5_squad_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("fine_tuned_t5_squad_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_t5_squad_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|330.2 MB| + +## References + +https://huggingface.co/Drashtip/fine_tuned_t5_squad + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-finetune_question_answer_thaiqa_pipeline_th.md b/docs/_posts/ahmedlone127/2024-12-15-finetune_question_answer_thaiqa_pipeline_th.md new file mode 100644 index 00000000000000..1b53d45dcbe01d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-finetune_question_answer_thaiqa_pipeline_th.md @@ -0,0 +1,69 @@ +--- +layout: model +title: Thai finetune_question_answer_thaiqa_pipeline pipeline CamemBertForQuestionAnswering from phoner45 +author: John Snow Labs +name: finetune_question_answer_thaiqa_pipeline +date: 2024-12-15 +tags: [th, open_source, pipeline, onnx] +task: Question Answering +language: th +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`finetune_question_answer_thaiqa_pipeline` is a Thai model originally trained by phoner45. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetune_question_answer_thaiqa_pipeline_th_5.5.1_3.0_1734296091175.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetune_question_answer_thaiqa_pipeline_th_5.5.1_3.0_1734296091175.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finetune_question_answer_thaiqa_pipeline", lang = "th") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finetune_question_answer_thaiqa_pipeline", lang = "th") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetune_question_answer_thaiqa_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|th| +|Size:|392.1 MB| + +## References + +https://huggingface.co/phoner45/finetune-Question-Answer-thaiqa + +## Included Models + +- MultiDocumentAssembler +- CamemBertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-finetune_question_answer_thaiqa_th.md b/docs/_posts/ahmedlone127/2024-12-15-finetune_question_answer_thaiqa_th.md new file mode 100644 index 00000000000000..77e5a6f6e3e377 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-finetune_question_answer_thaiqa_th.md @@ -0,0 +1,86 @@ +--- +layout: model +title: Thai finetune_question_answer_thaiqa CamemBertForQuestionAnswering from phoner45 +author: John Snow Labs +name: finetune_question_answer_thaiqa +date: 2024-12-15 +tags: [th, open_source, onnx, question_answering, camembert] +task: Question Answering +language: th +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CamemBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CamemBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`finetune_question_answer_thaiqa` is a Thai model originally trained by phoner45. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetune_question_answer_thaiqa_th_5.5.1_3.0_1734296071187.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetune_question_answer_thaiqa_th_5.5.1_3.0_1734296071187.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = MultiDocumentAssembler() \
    .setInputCols(["question", "context"]) \
    .setOutputCols(["document_question", "document_context"])

spanClassifier = CamemBertForQuestionAnswering.pretrained("finetune_question_answer_thaiqa","th") \
    .setInputCols(["document_question","document_context"]) \
    .setOutputCol("answer")

pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new MultiDocumentAssembler()
    .setInputCols(Array("question", "context"))
    .setOutputCols(Array("document_question", "document_context"))

val spanClassifier = CamemBertForQuestionAnswering.pretrained("finetune_question_answer_thaiqa", "th")
    .setInputCols(Array("document_question","document_context"))
    .setOutputCol("answer")

val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
val data = Seq(("What framework do I use?","I use spark-nlp.")).toDF("question", "context")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
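
Assuming the example above runs, the extracted span is stored in the `answer` annotation column; a short, illustrative way to view it next to the input question:

```python
# "answer.result" holds the predicted answer span for each question/context pair
pipelineDF.select("question", "context", "answer.result").show(truncate=False)
```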
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetune_question_answer_thaiqa| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|th| +|Size:|392.1 MB| + +## References + +https://huggingface.co/phoner45/finetune-Question-Answer-thaiqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-finetuned_embedding_v2_en.md b/docs/_posts/ahmedlone127/2024-12-15-finetuned_embedding_v2_en.md new file mode 100644 index 00000000000000..db38993b679d2f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-finetuned_embedding_v2_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English finetuned_embedding_v2 MPNetEmbeddings from KayaAI +author: John Snow Labs +name: finetuned_embedding_v2 +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`finetuned_embedding_v2` is a English model originally trained by KayaAI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_embedding_v2_en_5.5.1_3.0_1734305854974.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_embedding_v2_en_5.5.1_3.0_1734305854974.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("finetuned_embedding_v2","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("finetuned_embedding_v2","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
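
Assuming the example above runs, every annotation in the `embeddings` column exposes its sentence vector through the nested `embeddings` field; a minimal sketch of extracting the raw vectors:

```python
from pyspark.sql import functions as F

# Flatten the annotation array so each row holds one embedding vector (array of floats)
vectors = pipelineDF.select(F.explode(F.col("embeddings.embeddings")).alias("vector"))
vectors.show(1, truncate=80)
```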
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_embedding_v2| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.8 MB| + +## References + +https://huggingface.co/KayaAI/finetuned_embedding_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-finetuned_embedding_v2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-finetuned_embedding_v2_pipeline_en.md new file mode 100644 index 00000000000000..5a6fc2d8c194dc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-finetuned_embedding_v2_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English finetuned_embedding_v2_pipeline pipeline MPNetEmbeddings from KayaAI +author: John Snow Labs +name: finetuned_embedding_v2_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`finetuned_embedding_v2_pipeline` is a English model originally trained by KayaAI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_embedding_v2_pipeline_en_5.5.1_3.0_1734305876235.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_embedding_v2_pipeline_en_5.5.1_3.0_1734305876235.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finetuned_embedding_v2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finetuned_embedding_v2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_embedding_v2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.8 MB| + +## References + +https://huggingface.co/KayaAI/finetuned_embedding_v2 + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-finetuned_en.md b/docs/_posts/ahmedlone127/2024-12-15-finetuned_en.md new file mode 100644 index 00000000000000..ee529a73565678 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-finetuned_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English finetuned BertEmbeddings from vppvgit +author: John Snow Labs +name: finetuned +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`finetuned` is a English model originally trained by vppvgit. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_en_5.5.1_3.0_1734283947750.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_en_5.5.1_3.0_1734283947750.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("finetuned","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = BertEmbeddings.pretrained("finetuned","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[bert]| +|Language:|en| +|Size:|411.7 MB| + +## References + +https://huggingface.co/vppvgit/Finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-finetuned_nli_provenance_en.md b/docs/_posts/ahmedlone127/2024-12-15-finetuned_nli_provenance_en.md new file mode 100644 index 00000000000000..c81d901fa7b8db --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-finetuned_nli_provenance_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English finetuned_nli_provenance RoBertaForSequenceClassification from GuardrailsAI +author: John Snow Labs +name: finetuned_nli_provenance +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`finetuned_nli_provenance` is a English model originally trained by GuardrailsAI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_nli_provenance_en_5.5.1_3.0_1734288178297.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_nli_provenance_en_5.5.1_3.0_1734288178297.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_nli_provenance","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_nli_provenance", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_nli_provenance| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/GuardrailsAI/finetuned_nli_provenance \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-finetuned_nli_provenance_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-finetuned_nli_provenance_pipeline_en.md new file mode 100644 index 00000000000000..c0f5a32aebed76 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-finetuned_nli_provenance_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English finetuned_nli_provenance_pipeline pipeline RoBertaForSequenceClassification from GuardrailsAI +author: John Snow Labs +name: finetuned_nli_provenance_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`finetuned_nli_provenance_pipeline` is a English model originally trained by GuardrailsAI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_nli_provenance_pipeline_en_5.5.1_3.0_1734288251626.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_nli_provenance_pipeline_en_5.5.1_3.0_1734288251626.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finetuned_nli_provenance_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finetuned_nli_provenance_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_nli_provenance_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/GuardrailsAI/finetuned_nli_provenance + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-finetuned_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-finetuned_pipeline_en.md new file mode 100644 index 00000000000000..670b2c47ebf4eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-finetuned_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English finetuned_pipeline pipeline BertEmbeddings from vppvgit +author: John Snow Labs +name: finetuned_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`finetuned_pipeline` is a English model originally trained by vppvgit. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_pipeline_en_5.5.1_3.0_1734283970027.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_pipeline_en_5.5.1_3.0_1734283970027.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finetuned_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finetuned_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|411.8 MB| + +## References + +https://huggingface.co/vppvgit/Finetuned + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-finetuning_mbart_english_arabic_translation_en.md b/docs/_posts/ahmedlone127/2024-12-15-finetuning_mbart_english_arabic_translation_en.md new file mode 100644 index 00000000000000..08ca5eb33e4097 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-finetuning_mbart_english_arabic_translation_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English finetuning_mbart_english_arabic_translation T5Transformer from ahmed792002 +author: John Snow Labs +name: finetuning_mbart_english_arabic_translation +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`finetuning_mbart_english_arabic_translation` is a English model originally trained by ahmed792002. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_mbart_english_arabic_translation_en_5.5.1_3.0_1734299070917.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_mbart_english_arabic_translation_en_5.5.1_3.0_1734299070917.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("finetuning_mbart_english_arabic_translation","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("finetuning_mbart_english_arabic_translation", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_mbart_english_arabic_translation| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|949.0 MB| + +## References + +https://huggingface.co/ahmed792002/Finetuning_MBart_English_Arabic_Translation \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-finetuning_mbart_english_arabic_translation_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-finetuning_mbart_english_arabic_translation_pipeline_en.md new file mode 100644 index 00000000000000..ca8b77b44d8a0f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-finetuning_mbart_english_arabic_translation_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English finetuning_mbart_english_arabic_translation_pipeline pipeline T5Transformer from ahmed792002 +author: John Snow Labs +name: finetuning_mbart_english_arabic_translation_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`finetuning_mbart_english_arabic_translation_pipeline` is a English model originally trained by ahmed792002. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_mbart_english_arabic_translation_pipeline_en_5.5.1_3.0_1734299133848.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_mbart_english_arabic_translation_pipeline_en_5.5.1_3.0_1734299133848.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finetuning_mbart_english_arabic_translation_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finetuning_mbart_english_arabic_translation_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_mbart_english_arabic_translation_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|949.0 MB| + +## References + +https://huggingface.co/ahmed792002/Finetuning_MBart_English_Arabic_Translation + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-flan_t5_base_lora_wind_energy_v4_1_advanced_en.md b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_base_lora_wind_energy_v4_1_advanced_en.md new file mode 100644 index 00000000000000..91a56fd52659f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_base_lora_wind_energy_v4_1_advanced_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English flan_t5_base_lora_wind_energy_v4_1_advanced T5Transformer from nell123 +author: John Snow Labs +name: flan_t5_base_lora_wind_energy_v4_1_advanced +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_base_lora_wind_energy_v4_1_advanced` is a English model originally trained by nell123. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_base_lora_wind_energy_v4_1_advanced_en_5.5.1_3.0_1734301733348.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_base_lora_wind_energy_v4_1_advanced_en_5.5.1_3.0_1734301733348.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("flan_t5_base_lora_wind_energy_v4_1_advanced","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val t5 = T5Transformer.pretrained("flan_t5_base_lora_wind_energy_v4_1_advanced", "en")
  .setInputCols(Array("document"))
  .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div
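
The transformed DataFrame keeps the generated text in the `output` column as Spark NLP annotations. A minimal sketch of reading it back (assuming the Python example above has just been run, so `spark` and `pipelineDF` are in scope):

```python
# Minimal sketch: flatten the annotation results produced by the T5 stage above.
pipelineDF.selectExpr("text", "explode(output.result) AS generated_text") \
    .show(truncate=False)
```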
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_base_lora_wind_energy_v4_1_advanced| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/nell123/flan_t5_base-lora_wind_energy-v4.1_advanced \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-flan_t5_base_lora_wind_energy_v4_1_advanced_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_base_lora_wind_energy_v4_1_advanced_pipeline_en.md new file mode 100644 index 00000000000000..1051e15b39c6a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_base_lora_wind_energy_v4_1_advanced_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English flan_t5_base_lora_wind_energy_v4_1_advanced_pipeline pipeline T5Transformer from nell123 +author: John Snow Labs +name: flan_t5_base_lora_wind_energy_v4_1_advanced_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_base_lora_wind_energy_v4_1_advanced_pipeline` is a English model originally trained by nell123. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_base_lora_wind_energy_v4_1_advanced_pipeline_en_5.5.1_3.0_1734301788119.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_base_lora_wind_energy_v4_1_advanced_pipeline_en_5.5.1_3.0_1734301788119.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("flan_t5_base_lora_wind_energy_v4_1_advanced_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("flan_t5_base_lora_wind_energy_v4_1_advanced_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
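
The snippet above assumes a DataFrame named `df` with a `text` column. A minimal sketch of preparing one and running the pipeline end to end (assuming an active SparkSession named `spark`):

```python
# Minimal sketch: build the `df` referenced above and apply the pretrained pipeline to it.
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)
annotations.show(truncate=False)
```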
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_base_lora_wind_energy_v4_1_advanced_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/nell123/flan_t5_base-lora_wind_energy-v4.1_advanced + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-flan_t5_base_multilingual_sentiment_classification_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_base_multilingual_sentiment_classification_pipeline_xx.md new file mode 100644 index 00000000000000..c14efcabb335c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_base_multilingual_sentiment_classification_pipeline_xx.md @@ -0,0 +1,69 @@ +--- +layout: model +title: Multilingual flan_t5_base_multilingual_sentiment_classification_pipeline pipeline T5Transformer from Aryanpro321 +author: John Snow Labs +name: flan_t5_base_multilingual_sentiment_classification_pipeline +date: 2024-12-15 +tags: [xx, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: xx +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_base_multilingual_sentiment_classification_pipeline` is a Multilingual model originally trained by Aryanpro321. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_base_multilingual_sentiment_classification_pipeline_xx_5.5.1_3.0_1734300256055.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_base_multilingual_sentiment_classification_pipeline_xx_5.5.1_3.0_1734300256055.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("flan_t5_base_multilingual_sentiment_classification_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("flan_t5_base_multilingual_sentiment_classification_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
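
The snippet above assumes a DataFrame named `df` with a `text` column. A minimal sketch of preparing one and running the pipeline end to end (assuming an active SparkSession named `spark`):

```python
# Minimal sketch: build the `df` referenced above and apply the pretrained pipeline to it.
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)
annotations.show(truncate=False)
```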
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_base_multilingual_sentiment_classification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|1.0 GB| + +## References + +https://huggingface.co/Aryanpro321/flan-t5-base-multilingual-sentiment-classification + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-flan_t5_base_multilingual_sentiment_classification_xx.md b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_base_multilingual_sentiment_classification_xx.md new file mode 100644 index 00000000000000..75ddb865632200 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_base_multilingual_sentiment_classification_xx.md @@ -0,0 +1,86 @@ +--- +layout: model +title: Multilingual flan_t5_base_multilingual_sentiment_classification T5Transformer from Aryanpro321 +author: John Snow Labs +name: flan_t5_base_multilingual_sentiment_classification +date: 2024-12-15 +tags: [xx, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: xx +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_base_multilingual_sentiment_classification` is a Multilingual model originally trained by Aryanpro321. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_base_multilingual_sentiment_classification_xx_5.5.1_3.0_1734300205006.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_base_multilingual_sentiment_classification_xx_5.5.1_3.0_1734300205006.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("flan_t5_base_multilingual_sentiment_classification","xx") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val t5 = T5Transformer.pretrained("flan_t5_base_multilingual_sentiment_classification", "xx")
  .setInputCols(Array("document"))
  .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div
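
The transformed DataFrame keeps the generated text in the `output` column as Spark NLP annotations. A minimal sketch of reading it back (assuming the Python example above has just been run, so `spark` and `pipelineDF` are in scope):

```python
# Minimal sketch: flatten the annotation results produced by the T5 stage above.
pipelineDF.selectExpr("text", "explode(output.result) AS generated_text") \
    .show(truncate=False)
```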
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_base_multilingual_sentiment_classification| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|xx| +|Size:|1.0 GB| + +## References + +https://huggingface.co/Aryanpro321/flan-t5-base-multilingual-sentiment-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-flan_t5_base_paragrapher_en.md b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_base_paragrapher_en.md new file mode 100644 index 00000000000000..3dc22f167d9063 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_base_paragrapher_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English flan_t5_base_paragrapher T5Transformer from agentlans +author: John Snow Labs +name: flan_t5_base_paragrapher +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_base_paragrapher` is a English model originally trained by agentlans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_base_paragrapher_en_5.5.1_3.0_1734299249095.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_base_paragrapher_en_5.5.1_3.0_1734299249095.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("flan_t5_base_paragrapher","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val t5 = T5Transformer.pretrained("flan_t5_base_paragrapher", "en")
  .setInputCols(Array("document"))
  .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div
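
The transformed DataFrame keeps the generated text in the `output` column as Spark NLP annotations. A minimal sketch of reading it back (assuming the Python example above has just been run, so `spark` and `pipelineDF` are in scope):

```python
# Minimal sketch: flatten the annotation results produced by the T5 stage above.
pipelineDF.selectExpr("text", "explode(output.result) AS generated_text") \
    .show(truncate=False)
```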
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_base_paragrapher| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/agentlans/flan-t5-base-paragrapher \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-flan_t5_base_paragrapher_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_base_paragrapher_pipeline_en.md new file mode 100644 index 00000000000000..6257ff47afcbc0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_base_paragrapher_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English flan_t5_base_paragrapher_pipeline pipeline T5Transformer from agentlans +author: John Snow Labs +name: flan_t5_base_paragrapher_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_base_paragrapher_pipeline` is a English model originally trained by agentlans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_base_paragrapher_pipeline_en_5.5.1_3.0_1734299299486.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_base_paragrapher_pipeline_en_5.5.1_3.0_1734299299486.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("flan_t5_base_paragrapher_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("flan_t5_base_paragrapher_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
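
The snippet above assumes a DataFrame named `df` with a `text` column. A minimal sketch of preparing one and running the pipeline end to end (assuming an active SparkSession named `spark`):

```python
# Minimal sketch: build the `df` referenced above and apply the pretrained pipeline to it.
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)
annotations.show(truncate=False)
```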
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_base_paragrapher_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/agentlans/flan-t5-base-paragrapher + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-flan_t5_rouge_durga_q5_clean_4f_en.md b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_rouge_durga_q5_clean_4f_en.md new file mode 100644 index 00000000000000..620c86250d8083 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_rouge_durga_q5_clean_4f_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English flan_t5_rouge_durga_q5_clean_4f T5Transformer from devagonal +author: John Snow Labs +name: flan_t5_rouge_durga_q5_clean_4f +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_rouge_durga_q5_clean_4f` is a English model originally trained by devagonal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_rouge_durga_q5_clean_4f_en_5.5.1_3.0_1734301851615.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_rouge_durga_q5_clean_4f_en_5.5.1_3.0_1734301851615.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("flan_t5_rouge_durga_q5_clean_4f","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val t5 = T5Transformer.pretrained("flan_t5_rouge_durga_q5_clean_4f", "en")
  .setInputCols(Array("document"))
  .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div
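
The transformed DataFrame keeps the generated text in the `output` column as Spark NLP annotations. A minimal sketch of reading it back (assuming the Python example above has just been run, so `spark` and `pipelineDF` are in scope):

```python
# Minimal sketch: flatten the annotation results produced by the T5 stage above.
pipelineDF.selectExpr("text", "explode(output.result) AS generated_text") \
    .show(truncate=False)
```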
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_rouge_durga_q5_clean_4f| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/devagonal/flan-t5-rouge-durga-q5-clean-4f \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-flan_t5_rouge_durga_q5_clean_4f_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_rouge_durga_q5_clean_4f_pipeline_en.md new file mode 100644 index 00000000000000..6e29ae7d80ea41 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_rouge_durga_q5_clean_4f_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English flan_t5_rouge_durga_q5_clean_4f_pipeline pipeline T5Transformer from devagonal +author: John Snow Labs +name: flan_t5_rouge_durga_q5_clean_4f_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_rouge_durga_q5_clean_4f_pipeline` is a English model originally trained by devagonal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_rouge_durga_q5_clean_4f_pipeline_en_5.5.1_3.0_1734301901109.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_rouge_durga_q5_clean_4f_pipeline_en_5.5.1_3.0_1734301901109.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("flan_t5_rouge_durga_q5_clean_4f_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("flan_t5_rouge_durga_q5_clean_4f_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
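
The snippet above assumes a DataFrame named `df` with a `text` column. A minimal sketch of preparing one and running the pipeline end to end (assuming an active SparkSession named `spark`):

```python
# Minimal sketch: build the `df` referenced above and apply the pretrained pipeline to it.
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)
annotations.show(truncate=False)
```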
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_rouge_durga_q5_clean_4f_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/devagonal/flan-t5-rouge-durga-q5-clean-4f + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-flan_t5_small_finetuned_adarsh_12_en.md b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_small_finetuned_adarsh_12_en.md new file mode 100644 index 00000000000000..300a539bcab30e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_small_finetuned_adarsh_12_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English flan_t5_small_finetuned_adarsh_12 T5Transformer from Adarsh-12 +author: John Snow Labs +name: flan_t5_small_finetuned_adarsh_12 +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_small_finetuned_adarsh_12` is a English model originally trained by Adarsh-12. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_small_finetuned_adarsh_12_en_5.5.1_3.0_1734301458213.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_small_finetuned_adarsh_12_en_5.5.1_3.0_1734301458213.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("flan_t5_small_finetuned_adarsh_12","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val t5 = T5Transformer.pretrained("flan_t5_small_finetuned_adarsh_12", "en")
  .setInputCols(Array("document"))
  .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div
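
The transformed DataFrame keeps the generated text in the `output` column as Spark NLP annotations. A minimal sketch of reading it back (assuming the Python example above has just been run, so `spark` and `pipelineDF` are in scope):

```python
# Minimal sketch: flatten the annotation results produced by the T5 stage above.
pipelineDF.selectExpr("text", "explode(output.result) AS generated_text") \
    .show(truncate=False)
```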
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_small_finetuned_adarsh_12| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|349.8 MB| + +## References + +https://huggingface.co/Adarsh-12/flan-t5-small-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-flan_t5_small_finetuned_adarsh_12_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_small_finetuned_adarsh_12_pipeline_en.md new file mode 100644 index 00000000000000..ee8181b9107f8f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_small_finetuned_adarsh_12_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English flan_t5_small_finetuned_adarsh_12_pipeline pipeline T5Transformer from Adarsh-12 +author: John Snow Labs +name: flan_t5_small_finetuned_adarsh_12_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_small_finetuned_adarsh_12_pipeline` is a English model originally trained by Adarsh-12. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_small_finetuned_adarsh_12_pipeline_en_5.5.1_3.0_1734301475967.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_small_finetuned_adarsh_12_pipeline_en_5.5.1_3.0_1734301475967.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("flan_t5_small_finetuned_adarsh_12_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("flan_t5_small_finetuned_adarsh_12_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
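
The snippet above assumes a DataFrame named `df` with a `text` column. A minimal sketch of preparing one and running the pipeline end to end (assuming an active SparkSession named `spark`):

```python
# Minimal sketch: build the `df` referenced above and apply the pretrained pipeline to it.
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)
annotations.show(truncate=False)
```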
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_small_finetuned_adarsh_12_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|349.8 MB| + +## References + +https://huggingface.co/Adarsh-12/flan-t5-small-finetuned + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-flan_t5_small_newsqa_en.md b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_small_newsqa_en.md new file mode 100644 index 00000000000000..2a4ad8e02e6faa --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_small_newsqa_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English flan_t5_small_newsqa T5Transformer from Pavan48 +author: John Snow Labs +name: flan_t5_small_newsqa +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_small_newsqa` is a English model originally trained by Pavan48. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_small_newsqa_en_5.5.1_3.0_1734302444683.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_small_newsqa_en_5.5.1_3.0_1734302444683.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("flan_t5_small_newsqa","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val t5 = T5Transformer.pretrained("flan_t5_small_newsqa", "en")
  .setInputCols(Array("document"))
  .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div
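
The transformed DataFrame keeps the generated text in the `output` column as Spark NLP annotations. A minimal sketch of reading it back (assuming the Python example above has just been run, so `spark` and `pipelineDF` are in scope):

```python
# Minimal sketch: flatten the annotation results produced by the T5 stage above.
pipelineDF.selectExpr("text", "explode(output.result) AS generated_text") \
    .show(truncate=False)
```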
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_small_newsqa| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|349.8 MB| + +## References + +https://huggingface.co/Pavan48/flan_t5_small_newsqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-flan_t5_small_newsqa_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_small_newsqa_pipeline_en.md new file mode 100644 index 00000000000000..33dfc2a1e8d832 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_small_newsqa_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English flan_t5_small_newsqa_pipeline pipeline T5Transformer from Pavan48 +author: John Snow Labs +name: flan_t5_small_newsqa_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_small_newsqa_pipeline` is a English model originally trained by Pavan48. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_small_newsqa_pipeline_en_5.5.1_3.0_1734302462628.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_small_newsqa_pipeline_en_5.5.1_3.0_1734302462628.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("flan_t5_small_newsqa_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("flan_t5_small_newsqa_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
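
The snippet above assumes a DataFrame named `df` with a `text` column. A minimal sketch of preparing one and running the pipeline end to end (assuming an active SparkSession named `spark`):

```python
# Minimal sketch: build the `df` referenced above and apply the pretrained pipeline to it.
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)
annotations.show(truncate=False)
```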
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_small_newsqa_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|349.8 MB| + +## References + +https://huggingface.co/Pavan48/flan_t5_small_newsqa + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-flan_t5_small_search_query_generation_en.md b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_small_search_query_generation_en.md new file mode 100644 index 00000000000000..d1727b889240d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_small_search_query_generation_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English flan_t5_small_search_query_generation T5Transformer from 1rsh +author: John Snow Labs +name: flan_t5_small_search_query_generation +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_small_search_query_generation` is a English model originally trained by 1rsh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_small_search_query_generation_en_5.5.1_3.0_1734299816192.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_small_search_query_generation_en_5.5.1_3.0_1734299816192.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("flan_t5_small_search_query_generation","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val t5 = T5Transformer.pretrained("flan_t5_small_search_query_generation", "en")
  .setInputCols(Array("document"))
  .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div
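
The transformed DataFrame keeps the generated text in the `output` column as Spark NLP annotations. A minimal sketch of reading it back (assuming the Python example above has just been run, so `spark` and `pipelineDF` are in scope):

```python
# Minimal sketch: flatten the annotation results produced by the T5 stage above.
pipelineDF.selectExpr("text", "explode(output.result) AS generated_text") \
    .show(truncate=False)
```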
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_small_search_query_generation| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|349.8 MB| + +## References + +https://huggingface.co/1rsh/flan-t5-small-search-query-generation \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-flan_t5_small_search_query_generation_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_small_search_query_generation_pipeline_en.md new file mode 100644 index 00000000000000..00aa7dd466029b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-flan_t5_small_search_query_generation_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English flan_t5_small_search_query_generation_pipeline pipeline T5Transformer from 1rsh +author: John Snow Labs +name: flan_t5_small_search_query_generation_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_small_search_query_generation_pipeline` is a English model originally trained by 1rsh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_small_search_query_generation_pipeline_en_5.5.1_3.0_1734299834394.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_small_search_query_generation_pipeline_en_5.5.1_3.0_1734299834394.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("flan_t5_small_search_query_generation_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("flan_t5_small_search_query_generation_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
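
The snippet above assumes a DataFrame named `df` with a `text` column. A minimal sketch of preparing one and running the pipeline end to end (assuming an active SparkSession named `spark`):

```python
# Minimal sketch: build the `df` referenced above and apply the pretrained pipeline to it.
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)
annotations.show(truncate=False)
```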
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_small_search_query_generation_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|349.8 MB| + +## References + +https://huggingface.co/1rsh/flan-t5-small-search-query-generation + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-flowertune_llm_google_t5_base_en.md b/docs/_posts/ahmedlone127/2024-12-15-flowertune_llm_google_t5_base_en.md new file mode 100644 index 00000000000000..e5ae82cc3b3486 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-flowertune_llm_google_t5_base_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English flowertune_llm_google_t5_base T5Transformer from layonsan +author: John Snow Labs +name: flowertune_llm_google_t5_base +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flowertune_llm_google_t5_base` is a English model originally trained by layonsan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flowertune_llm_google_t5_base_en_5.5.1_3.0_1734301256334.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flowertune_llm_google_t5_base_en_5.5.1_3.0_1734301256334.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("flowertune_llm_google_t5_base","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val t5 = T5Transformer.pretrained("flowertune_llm_google_t5_base", "en")
  .setInputCols(Array("document"))
  .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div
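
The transformed DataFrame keeps the generated text in the `output` column as Spark NLP annotations. A minimal sketch of reading it back (assuming the Python example above has just been run, so `spark` and `pipelineDF` are in scope):

```python
# Minimal sketch: flatten the annotation results produced by the T5 stage above.
pipelineDF.selectExpr("text", "explode(output.result) AS generated_text") \
    .show(truncate=False)
```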
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flowertune_llm_google_t5_base| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|594.4 MB| + +## References + +https://huggingface.co/layonsan/flowertune-llm-google-t5-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-flowertune_llm_google_t5_base_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-flowertune_llm_google_t5_base_pipeline_en.md new file mode 100644 index 00000000000000..f49f833d5fd959 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-flowertune_llm_google_t5_base_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English flowertune_llm_google_t5_base_pipeline pipeline T5Transformer from layonsan +author: John Snow Labs +name: flowertune_llm_google_t5_base_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flowertune_llm_google_t5_base_pipeline` is a English model originally trained by layonsan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flowertune_llm_google_t5_base_pipeline_en_5.5.1_3.0_1734301419991.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flowertune_llm_google_t5_base_pipeline_en_5.5.1_3.0_1734301419991.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("flowertune_llm_google_t5_base_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("flowertune_llm_google_t5_base_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
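
The snippet above assumes a DataFrame named `df` with a `text` column. A minimal sketch of preparing one and running the pipeline end to end (assuming an active SparkSession named `spark`):

```python
# Minimal sketch: build the `df` referenced above and apply the pretrained pipeline to it.
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)
annotations.show(truncate=False)
```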
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flowertune_llm_google_t5_base_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|594.4 MB| + +## References + +https://huggingface.co/layonsan/flowertune-llm-google-t5-base + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-flowertune_llm_google_t5_small_en.md b/docs/_posts/ahmedlone127/2024-12-15-flowertune_llm_google_t5_small_en.md new file mode 100644 index 00000000000000..65ed3f065be7ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-flowertune_llm_google_t5_small_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English flowertune_llm_google_t5_small T5Transformer from layonsan +author: John Snow Labs +name: flowertune_llm_google_t5_small +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flowertune_llm_google_t5_small` is a English model originally trained by layonsan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flowertune_llm_google_t5_small_en_5.5.1_3.0_1734299566110.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flowertune_llm_google_t5_small_en_5.5.1_3.0_1734299566110.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("flowertune_llm_google_t5_small","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val t5 = T5Transformer.pretrained("flowertune_llm_google_t5_small", "en")
  .setInputCols(Array("document"))
  .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div
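
The transformed DataFrame keeps the generated text in the `output` column as Spark NLP annotations. A minimal sketch of reading it back (assuming the Python example above has just been run, so `spark` and `pipelineDF` are in scope):

```python
# Minimal sketch: flatten the annotation results produced by the T5 stage above.
pipelineDF.selectExpr("text", "explode(output.result) AS generated_text") \
    .show(truncate=False)
```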
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flowertune_llm_google_t5_small| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|594.4 MB| + +## References + +https://huggingface.co/layonsan/flowertune-llm-google-t5-small \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-flowertune_llm_google_t5_small_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-flowertune_llm_google_t5_small_pipeline_en.md new file mode 100644 index 00000000000000..421cc016720bdd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-flowertune_llm_google_t5_small_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English flowertune_llm_google_t5_small_pipeline pipeline T5Transformer from layonsan +author: John Snow Labs +name: flowertune_llm_google_t5_small_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flowertune_llm_google_t5_small_pipeline` is a English model originally trained by layonsan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flowertune_llm_google_t5_small_pipeline_en_5.5.1_3.0_1734299716663.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flowertune_llm_google_t5_small_pipeline_en_5.5.1_3.0_1734299716663.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("flowertune_llm_google_t5_small_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("flowertune_llm_google_t5_small_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
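
The snippet above assumes a DataFrame named `df` with a `text` column. A minimal sketch of preparing one and running the pipeline end to end (assuming an active SparkSession named `spark`):

```python
# Minimal sketch: build the `df` referenced above and apply the pretrained pipeline to it.
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)
annotations.show(truncate=False)
```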
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flowertune_llm_google_t5_small_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|594.4 MB| + +## References + +https://huggingface.co/layonsan/flowertune-llm-google-t5-small + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-ft_exlmr_am.md b/docs/_posts/ahmedlone127/2024-12-15-ft_exlmr_am.md new file mode 100644 index 00000000000000..d6aab4595ad75c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-ft_exlmr_am.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Amharic ft_exlmr XlmRoBertaForSequenceClassification from Hailay +author: John Snow Labs +name: ft_exlmr +date: 2024-12-15 +tags: [am, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: am +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ft_exlmr` is a Amharic model originally trained by Hailay. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ft_exlmr_am_5.5.1_3.0_1734292681516.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ft_exlmr_am_5.5.1_3.0_1734292681516.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("ft_exlmr","am") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("ft_exlmr", "am")
  .setInputCols(Array("document","token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div
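
The predicted label is stored in the `class` column of the transformed DataFrame. A minimal sketch of reading it back (assuming the Python example above has just been run, so `spark` and `pipelineDF` are in scope):

```python
# Minimal sketch: show the predicted label for each input row.
pipelineDF.selectExpr("text", "explode(`class`.result) AS predicted_label") \
    .show(truncate=False)
```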
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ft_exlmr| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|am| +|Size:|907.8 MB| + +## References + +https://huggingface.co/Hailay/FT_EXLMR \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-ft_exlmr_pipeline_am.md b/docs/_posts/ahmedlone127/2024-12-15-ft_exlmr_pipeline_am.md new file mode 100644 index 00000000000000..043efc3cb001c3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-ft_exlmr_pipeline_am.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Amharic ft_exlmr_pipeline pipeline XlmRoBertaForSequenceClassification from Hailay +author: John Snow Labs +name: ft_exlmr_pipeline +date: 2024-12-15 +tags: [am, open_source, pipeline, onnx] +task: Text Classification +language: am +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ft_exlmr_pipeline` is a Amharic model originally trained by Hailay. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ft_exlmr_pipeline_am_5.5.1_3.0_1734292783815.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ft_exlmr_pipeline_am_5.5.1_3.0_1734292783815.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("ft_exlmr_pipeline", lang = "am") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("ft_exlmr_pipeline", lang = "am") +val annotations = pipeline.transform(df) + +``` +
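
The snippet above assumes a DataFrame named `df` with a `text` column. A minimal sketch of preparing one and running the pipeline end to end (assuming an active SparkSession named `spark`):

```python
# Minimal sketch: build the `df` referenced above and apply the pretrained pipeline to it.
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)
annotations.show(truncate=False)
```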
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ft_exlmr_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|am| +|Size:|907.8 MB| + +## References + +https://huggingface.co/Hailay/FT_EXLMR + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-furina_seed42_eng_amh_esp_basic_en.md b/docs/_posts/ahmedlone127/2024-12-15-furina_seed42_eng_amh_esp_basic_en.md new file mode 100644 index 00000000000000..db5e4497565dc2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-furina_seed42_eng_amh_esp_basic_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English furina_seed42_eng_amh_esp_basic XlmRoBertaForSequenceClassification from Shijia +author: John Snow Labs +name: furina_seed42_eng_amh_esp_basic +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`furina_seed42_eng_amh_esp_basic` is a English model originally trained by Shijia. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/furina_seed42_eng_amh_esp_basic_en_5.5.1_3.0_1734292134546.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/furina_seed42_eng_amh_esp_basic_en_5.5.1_3.0_1734292134546.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("furina_seed42_eng_amh_esp_basic","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("furina_seed42_eng_amh_esp_basic", "en")
  .setInputCols(Array("document","token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div
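
The predicted label is stored in the `class` column of the transformed DataFrame. A minimal sketch of reading it back (assuming the Python example above has just been run, so `spark` and `pipelineDF` are in scope):

```python
# Minimal sketch: show the predicted label for each input row.
pipelineDF.selectExpr("text", "explode(`class`.result) AS predicted_label") \
    .show(truncate=False)
```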
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|furina_seed42_eng_amh_esp_basic| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.5 GB| + +## References + +https://huggingface.co/Shijia/furina_seed42_eng_amh_esp_basic \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-furina_seed42_eng_amh_esp_basic_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-furina_seed42_eng_amh_esp_basic_pipeline_en.md new file mode 100644 index 00000000000000..ae67da0d635d22 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-furina_seed42_eng_amh_esp_basic_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English furina_seed42_eng_amh_esp_basic_pipeline pipeline XlmRoBertaForSequenceClassification from Shijia +author: John Snow Labs +name: furina_seed42_eng_amh_esp_basic_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`furina_seed42_eng_amh_esp_basic_pipeline` is a English model originally trained by Shijia. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/furina_seed42_eng_amh_esp_basic_pipeline_en_5.5.1_3.0_1734292209151.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/furina_seed42_eng_amh_esp_basic_pipeline_en_5.5.1_3.0_1734292209151.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("furina_seed42_eng_amh_esp_basic_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("furina_seed42_eng_amh_esp_basic_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
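
The snippet above assumes a DataFrame named `df` with a `text` column. A minimal sketch of preparing one and running the pipeline end to end (assuming an active SparkSession named `spark`):

```python
# Minimal sketch: build the `df` referenced above and apply the pretrained pipeline to it.
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)
annotations.show(truncate=False)
```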
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|furina_seed42_eng_amh_esp_basic_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.5 GB| + +## References + +https://huggingface.co/Shijia/furina_seed42_eng_amh_esp_basic + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-generative_question_en.md b/docs/_posts/ahmedlone127/2024-12-15-generative_question_en.md new file mode 100644 index 00000000000000..e37de7a42c9484 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-generative_question_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English generative_question T5Transformer from Kais4rx +author: John Snow Labs +name: generative_question +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`generative_question` is a English model originally trained by Kais4rx. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/generative_question_en_5.5.1_3.0_1734301591049.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/generative_question_en_5.5.1_3.0_1734301591049.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

# the T5 stage reads the "document" column produced by the DocumentAssembler
t5 = T5Transformer.pretrained("generative_question","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val t5 = T5Transformer.pretrained("generative_question", "en")
  .setInputCols(Array("document"))
  .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
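
Once the pipeline has run, the generated text sits in the `output` annotation column configured above. A brief sketch of reading it back, continuing from the Python example:

```python
# Each annotation exposes its generated text in the "result" field.
pipelineDF.select("output.result").show(truncate=False)
```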
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|generative_question| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|342.0 MB| + +## References + +https://huggingface.co/Kais4rx/generative_question \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-generative_question_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-generative_question_pipeline_en.md new file mode 100644 index 00000000000000..f466e2658d6848 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-generative_question_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English generative_question_pipeline pipeline T5Transformer from Kais4rx +author: John Snow Labs +name: generative_question_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`generative_question_pipeline` is a English model originally trained by Kais4rx. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/generative_question_pipeline_en_5.5.1_3.0_1734301609807.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/generative_question_pipeline_en_5.5.1_3.0_1734301609807.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("generative_question_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("generative_question_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|generative_question_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|342.0 MB| + +## References + +https://huggingface.co/Kais4rx/generative_question + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-google_t5_efficient_mini_n12_newssummary_en.md b/docs/_posts/ahmedlone127/2024-12-15-google_t5_efficient_mini_n12_newssummary_en.md new file mode 100644 index 00000000000000..ea3ba1b296afa8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-google_t5_efficient_mini_n12_newssummary_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English google_t5_efficient_mini_n12_newssummary T5Transformer from shorecode +author: John Snow Labs +name: google_t5_efficient_mini_n12_newssummary +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`google_t5_efficient_mini_n12_newssummary` is a English model originally trained by shorecode. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/google_t5_efficient_mini_n12_newssummary_en_5.5.1_3.0_1734301770854.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/google_t5_efficient_mini_n12_newssummary_en_5.5.1_3.0_1734301770854.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

# the T5 stage reads the "document" column produced by the DocumentAssembler
t5 = T5Transformer.pretrained("google_t5_efficient_mini_n12_newssummary","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val t5 = T5Transformer.pretrained("google_t5_efficient_mini_n12_newssummary", "en")
  .setInputCols(Array("document"))
  .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
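
Because this checkpoint is a news-summarization fine-tune, it can help to pass an explicit task prefix and cap the output length. The sketch below reuses the `documentAssembler` defined above; the `summarize:` prefix and the length setting are assumptions made to illustrate the API, so check the upstream model card before relying on them.

```python
# Variation on the example above: task prefix plus an output-length cap for summarization.
t5_summarizer = T5Transformer.pretrained("google_t5_efficient_mini_n12_newssummary","en") \
    .setInputCols(["document"]) \
    .setOutputCol("summary") \
    .setTask("summarize:") \
    .setMaxOutputLength(128)

article = spark.createDataFrame([["The city council met on Monday and approved the new transit budget after a lengthy debate."]]).toDF("text")
Pipeline().setStages([documentAssembler, t5_summarizer]).fit(article).transform(article) \
    .select("summary.result").show(truncate=False)
```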
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|google_t5_efficient_mini_n12_newssummary| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|102.8 MB| + +## References + +https://huggingface.co/shorecode/google-t5-efficient-mini-n12-newssummary \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-google_t5_efficient_mini_n12_newssummary_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-google_t5_efficient_mini_n12_newssummary_pipeline_en.md new file mode 100644 index 00000000000000..4c25815388e345 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-google_t5_efficient_mini_n12_newssummary_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English google_t5_efficient_mini_n12_newssummary_pipeline pipeline T5Transformer from shorecode +author: John Snow Labs +name: google_t5_efficient_mini_n12_newssummary_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`google_t5_efficient_mini_n12_newssummary_pipeline` is a English model originally trained by shorecode. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/google_t5_efficient_mini_n12_newssummary_pipeline_en_5.5.1_3.0_1734301776575.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/google_t5_efficient_mini_n12_newssummary_pipeline_en_5.5.1_3.0_1734301776575.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("google_t5_efficient_mini_n12_newssummary_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("google_t5_efficient_mini_n12_newssummary_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|google_t5_efficient_mini_n12_newssummary_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|102.8 MB| + +## References + +https://huggingface.co/shorecode/google-t5-efficient-mini-n12-newssummary + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-h2_keywordextractor_en.md b/docs/_posts/ahmedlone127/2024-12-15-h2_keywordextractor_en.md new file mode 100644 index 00000000000000..7ce36c9cd70ae8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-h2_keywordextractor_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English h2_keywordextractor BartTransformer from transformer3 +author: John Snow Labs +name: h2_keywordextractor +date: 2024-12-15 +tags: [en, open_source, onnx, text_generation, bart] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BartTransformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BartTransformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`h2_keywordextractor` is a English model originally trained by transformer3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/h2_keywordextractor_en_5.5.1_3.0_1734303340085.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/h2_keywordextractor_en_5.5.1_3.0_1734303340085.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# the BART stage reads the "document" column produced by the DocumentAssembler
seq2seq = BartTransformer.pretrained("h2_keywordextractor","en") \
    .setInputCols(["document"]) \
    .setOutputCol("generation")

pipeline = Pipeline().setStages([documentAssembler, seq2seq])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val seq2seq = BartTransformer.pretrained("h2_keywordextractor","en")
  .setInputCols(Array("document"))
  .setOutputCol("generation")

val pipeline = new Pipeline().setStages(Array(documentAssembler, seq2seq))
val data = Seq("I love spark-nlp").toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
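
The extracted keywords land in the `generation` column configured above. A short usage sketch that reuses the fitted `pipelineModel` from the Python example with a more realistic passage:

```python
# Feed a longer passage and read the generated keyword summary back out.
passage = spark.createDataFrame([[
    "Quarterly revenue grew 12 percent, driven by cloud subscriptions and strong enterprise renewals."
]]).toDF("text")

pipelineModel.transform(passage).select("generation.result").show(truncate=False)
```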
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|h2_keywordextractor| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.9 GB| + +## References + +https://huggingface.co/transformer3/H2-keywordextractor \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-h2_keywordextractor_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-h2_keywordextractor_pipeline_en.md new file mode 100644 index 00000000000000..25f3758059b154 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-h2_keywordextractor_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English h2_keywordextractor_pipeline pipeline BartTransformer from transformer3 +author: John Snow Labs +name: h2_keywordextractor_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BartTransformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`h2_keywordextractor_pipeline` is a English model originally trained by transformer3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/h2_keywordextractor_pipeline_en_5.5.1_3.0_1734303430574.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/h2_keywordextractor_pipeline_en_5.5.1_3.0_1734303430574.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("h2_keywordextractor_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("h2_keywordextractor_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|h2_keywordextractor_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.9 GB| + +## References + +https://huggingface.co/transformer3/H2-keywordextractor + +## Included Models + +- DocumentAssembler +- BartTransformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-intent_analysis_3labels_v1_en.md b/docs/_posts/ahmedlone127/2024-12-15-intent_analysis_3labels_v1_en.md new file mode 100644 index 00000000000000..04af64052a0f70 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-intent_analysis_3labels_v1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English intent_analysis_3labels_v1 XlmRoBertaForSequenceClassification from adriansanz +author: John Snow Labs +name: intent_analysis_3labels_v1 +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`intent_analysis_3labels_v1` is a English model originally trained by adriansanz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/intent_analysis_3labels_v1_en_5.5.1_3.0_1734292960315.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/intent_analysis_3labels_v1_en_5.5.1_3.0_1734292960315.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

# the classifier consumes the assembled document together with its tokens
sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("intent_analysis_3labels_v1","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("intent_analysis_3labels_v1", "en")
  .setInputCols(Array("document","token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
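
To see which of the three intent labels the model assigns, read the `class` annotation column back out of the transformed DataFrame; per-label scores are attached to each annotation's metadata. A brief sketch, continuing from the Python example above:

```python
from pyspark.sql.functions import col, explode

# Predicted label per input row.
pipelineDF.select("text", "class.result").show(truncate=False)

# Optional: inspect the raw scores stored in the annotation metadata.
pipelineDF.select(explode(col("class")).alias("prediction")) \
    .select("prediction.result", "prediction.metadata") \
    .show(truncate=False)
```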
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|intent_analysis_3labels_v1| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|779.2 MB| + +## References + +https://huggingface.co/adriansanz/intent_analysis_3labels_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-intent_analysis_3labels_v1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-intent_analysis_3labels_v1_pipeline_en.md new file mode 100644 index 00000000000000..a59e28b235f6f7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-intent_analysis_3labels_v1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English intent_analysis_3labels_v1_pipeline pipeline XlmRoBertaForSequenceClassification from adriansanz +author: John Snow Labs +name: intent_analysis_3labels_v1_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`intent_analysis_3labels_v1_pipeline` is a English model originally trained by adriansanz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/intent_analysis_3labels_v1_pipeline_en_5.5.1_3.0_1734293099045.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/intent_analysis_3labels_v1_pipeline_en_5.5.1_3.0_1734293099045.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("intent_analysis_3labels_v1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("intent_analysis_3labels_v1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|intent_analysis_3labels_v1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|779.2 MB| + +## References + +https://huggingface.co/adriansanz/intent_analysis_3labels_v1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-kazparc_russian_english_model_1_astersignature_en.md b/docs/_posts/ahmedlone127/2024-12-15-kazparc_russian_english_model_1_astersignature_en.md new file mode 100644 index 00000000000000..85f8e4a869e4bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-kazparc_russian_english_model_1_astersignature_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English kazparc_russian_english_model_1_astersignature T5Transformer from astersignature +author: John Snow Labs +name: kazparc_russian_english_model_1_astersignature +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`kazparc_russian_english_model_1_astersignature` is a English model originally trained by astersignature. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kazparc_russian_english_model_1_astersignature_en_5.5.1_3.0_1734302567210.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kazparc_russian_english_model_1_astersignature_en_5.5.1_3.0_1734302567210.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

# the T5 stage reads the "document" column produced by the DocumentAssembler
t5 = T5Transformer.pretrained("kazparc_russian_english_model_1_astersignature","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val t5 = T5Transformer.pretrained("kazparc_russian_english_model_1_astersignature", "en")
  .setInputCols(Array("document"))
  .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
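
Since this checkpoint was fine-tuned for Russian-to-English translation, a Russian sentence is the more representative input. A hedged sketch reusing the fitted `pipelineModel` from the Python example; whether a task prefix (for example `translate Russian to English:`) is needed depends on how the upstream model was trained, so treat that as an open assumption.

```python
# Translate a Russian sentence and read the generated English text back out.
russian = spark.createDataFrame([["Я люблю обработку естественного языка."]]).toDF("text")
pipelineModel.transform(russian).select("output.result").show(truncate=False)
```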
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kazparc_russian_english_model_1_astersignature| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|342.1 MB| + +## References + +https://huggingface.co/astersignature/kazparc_ru_en_model_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-kazparc_russian_english_model_1_astersignature_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-kazparc_russian_english_model_1_astersignature_pipeline_en.md new file mode 100644 index 00000000000000..bb0404a31ff4a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-kazparc_russian_english_model_1_astersignature_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English kazparc_russian_english_model_1_astersignature_pipeline pipeline T5Transformer from astersignature +author: John Snow Labs +name: kazparc_russian_english_model_1_astersignature_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`kazparc_russian_english_model_1_astersignature_pipeline` is a English model originally trained by astersignature. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kazparc_russian_english_model_1_astersignature_pipeline_en_5.5.1_3.0_1734302585985.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kazparc_russian_english_model_1_astersignature_pipeline_en_5.5.1_3.0_1734302585985.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("kazparc_russian_english_model_1_astersignature_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("kazparc_russian_english_model_1_astersignature_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kazparc_russian_english_model_1_astersignature_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|342.1 MB| + +## References + +https://huggingface.co/astersignature/kazparc_ru_en_model_1 + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-keyword_summarizer_2000_v1_en.md b/docs/_posts/ahmedlone127/2024-12-15-keyword_summarizer_2000_v1_en.md new file mode 100644 index 00000000000000..b7e4297bc27a02 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-keyword_summarizer_2000_v1_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English keyword_summarizer_2000_v1 T5Transformer from ZephyrUtopia +author: John Snow Labs +name: keyword_summarizer_2000_v1 +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`keyword_summarizer_2000_v1` is a English model originally trained by ZephyrUtopia. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/keyword_summarizer_2000_v1_en_5.5.1_3.0_1734301251351.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/keyword_summarizer_2000_v1_en_5.5.1_3.0_1734301251351.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

# the T5 stage reads the "document" column produced by the DocumentAssembler
t5 = T5Transformer.pretrained("keyword_summarizer_2000_v1","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val t5 = T5Transformer.pretrained("keyword_summarizer_2000_v1", "en")
  .setInputCols(Array("document"))
  .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
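
For longer inputs it can help to bound the generated keyword summary explicitly. The sketch below reuses `documentAssembler` and `data` from the example above; the specific length limits are illustrative assumptions, not documented defaults of this checkpoint.

```python
# Cap generation length so keyword summaries stay short.
t5_short = T5Transformer.pretrained("keyword_summarizer_2000_v1","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output") \
    .setMinOutputLength(5) \
    .setMaxOutputLength(64)

Pipeline().setStages([documentAssembler, t5_short]).fit(data).transform(data) \
    .select("output.result").show(truncate=False)
```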
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|keyword_summarizer_2000_v1| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/ZephyrUtopia/keyword-summarizer-2000-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-keyword_summarizer_2000_v1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-keyword_summarizer_2000_v1_pipeline_en.md new file mode 100644 index 00000000000000..facb292720650e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-keyword_summarizer_2000_v1_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English keyword_summarizer_2000_v1_pipeline pipeline T5Transformer from ZephyrUtopia +author: John Snow Labs +name: keyword_summarizer_2000_v1_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`keyword_summarizer_2000_v1_pipeline` is a English model originally trained by ZephyrUtopia. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/keyword_summarizer_2000_v1_pipeline_en_5.5.1_3.0_1734301316870.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/keyword_summarizer_2000_v1_pipeline_en_5.5.1_3.0_1734301316870.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("keyword_summarizer_2000_v1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("keyword_summarizer_2000_v1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|keyword_summarizer_2000_v1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/ZephyrUtopia/keyword-summarizer-2000-v1 + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-language_modeling_from_scratch_malayalam_en.md b/docs/_posts/ahmedlone127/2024-12-15-language_modeling_from_scratch_malayalam_en.md new file mode 100644 index 00000000000000..f9f55bec51708e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-language_modeling_from_scratch_malayalam_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English language_modeling_from_scratch_malayalam BertEmbeddings from Tural +author: John Snow Labs +name: language_modeling_from_scratch_malayalam +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`language_modeling_from_scratch_malayalam` is a English model originally trained by Tural. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/language_modeling_from_scratch_malayalam_en_5.5.1_3.0_1734283781089.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/language_modeling_from_scratch_malayalam_en_5.5.1_3.0_1734283781089.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("language_modeling_from_scratch_malayalam","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = BertEmbeddings.pretrained("language_modeling_from_scratch_malayalam","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
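
The model writes one vector per token into the `embeddings` annotation column. If plain Spark ML vectors are easier to consume downstream, an `EmbeddingsFinisher` can unwrap them; a minimal sketch, continuing from the Python example above:

```python
from sparknlp.base import EmbeddingsFinisher

# Convert Spark NLP annotations into Spark ML vectors, one per token.
finisher = EmbeddingsFinisher() \
    .setInputCols(["embeddings"]) \
    .setOutputCols(["finished_embeddings"]) \
    .setOutputAsVector(True)

finisher.transform(pipelineDF) \
    .selectExpr("explode(finished_embeddings) as token_vector") \
    .show(5, truncate=100)
```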
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|language_modeling_from_scratch_malayalam| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[bert]| +|Language:|en| +|Size:|408.0 MB| + +## References + +https://huggingface.co/Tural/language-modeling-from-scratch-ml \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-language_modeling_from_scratch_malayalam_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-language_modeling_from_scratch_malayalam_pipeline_en.md new file mode 100644 index 00000000000000..238e0528d699f4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-language_modeling_from_scratch_malayalam_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English language_modeling_from_scratch_malayalam_pipeline pipeline BertEmbeddings from Tural +author: John Snow Labs +name: language_modeling_from_scratch_malayalam_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`language_modeling_from_scratch_malayalam_pipeline` is a English model originally trained by Tural. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/language_modeling_from_scratch_malayalam_pipeline_en_5.5.1_3.0_1734283803778.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/language_modeling_from_scratch_malayalam_pipeline_en_5.5.1_3.0_1734283803778.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("language_modeling_from_scratch_malayalam_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("language_modeling_from_scratch_malayalam_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|language_modeling_from_scratch_malayalam_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|408.0 MB| + +## References + +https://huggingface.co/Tural/language-modeling-from-scratch-ml + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-lasto4_en.md b/docs/_posts/ahmedlone127/2024-12-15-lasto4_en.md new file mode 100644 index 00000000000000..f616167410cef8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-lasto4_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English lasto4 XlmRoBertaForSequenceClassification from afiqlol +author: John Snow Labs +name: lasto4 +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`lasto4` is a English model originally trained by afiqlol. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/lasto4_en_5.5.1_3.0_1734293055478.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/lasto4_en_5.5.1_3.0_1734293055478.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

# the classifier consumes the assembled document together with its tokens
sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("lasto4","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("lasto4", "en")
  .setInputCols(Array("document","token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|lasto4| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/afiqlol/lasto4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-lasto4_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-lasto4_pipeline_en.md new file mode 100644 index 00000000000000..cca1418a62c4e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-lasto4_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English lasto4_pipeline pipeline XlmRoBertaForSequenceClassification from afiqlol +author: John Snow Labs +name: lasto4_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`lasto4_pipeline` is a English model originally trained by afiqlol. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/lasto4_pipeline_en_5.5.1_3.0_1734293113182.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/lasto4_pipeline_en_5.5.1_3.0_1734293113182.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("lasto4_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("lasto4_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|lasto4_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/afiqlol/lasto4 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-leia_multilingual_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-12-15-leia_multilingual_pipeline_xx.md new file mode 100644 index 00000000000000..1820a6a344d806 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-leia_multilingual_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual leia_multilingual_pipeline pipeline XlmRoBertaForSequenceClassification from LEIA +author: John Snow Labs +name: leia_multilingual_pipeline +date: 2024-12-15 +tags: [xx, open_source, pipeline, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`leia_multilingual_pipeline` is a Multilingual model originally trained by LEIA. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/leia_multilingual_pipeline_xx_5.5.1_3.0_1734291856497.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/leia_multilingual_pipeline_xx_5.5.1_3.0_1734291856497.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("leia_multilingual_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("leia_multilingual_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
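
Because this pipeline is multilingual, the `text` column can mix languages freely. A short sketch of preparing such input; as above, `df` is assumed to be any DataFrame with a `text` column, and the `class` output column name follows from the bundled sequence classifier rather than from this page.

```python
from sparknlp.pretrained import PretrainedPipeline

df = spark.createDataFrame([
    ["I really enjoyed this film."],
    ["Der Film hat mir sehr gut gefallen."],
    ["この映画はとても良かった。"]
]).toDF("text")

pipeline = PretrainedPipeline("leia_multilingual_pipeline", lang = "xx")
pipeline.transform(df).select("text", "class.result").show(truncate=False)
```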
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|leia_multilingual_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|887.6 MB| + +## References + +https://huggingface.co/LEIA/LEIA-multilingual + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-leia_multilingual_xx.md b/docs/_posts/ahmedlone127/2024-12-15-leia_multilingual_xx.md new file mode 100644 index 00000000000000..e32877f73655e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-leia_multilingual_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual leia_multilingual XlmRoBertaForSequenceClassification from LEIA +author: John Snow Labs +name: leia_multilingual +date: 2024-12-15 +tags: [xx, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: xx +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`leia_multilingual` is a Multilingual model originally trained by LEIA. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/leia_multilingual_xx_5.5.1_3.0_1734291762162.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/leia_multilingual_xx_5.5.1_3.0_1734291762162.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

# the classifier consumes the assembled document together with its tokens
sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("leia_multilingual","xx") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("leia_multilingual", "xx")
  .setInputCols(Array("document","token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|leia_multilingual| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|887.6 MB| + +## References + +https://huggingface.co/LEIA/LEIA-multilingual \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-lithuanian_patent_inventor_linking_en.md b/docs/_posts/ahmedlone127/2024-12-15-lithuanian_patent_inventor_linking_en.md new file mode 100644 index 00000000000000..56f723ab0d6228 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-lithuanian_patent_inventor_linking_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English lithuanian_patent_inventor_linking MPNetEmbeddings from gbpatentdata +author: John Snow Labs +name: lithuanian_patent_inventor_linking +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`lithuanian_patent_inventor_linking` is a English model originally trained by gbpatentdata. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/lithuanian_patent_inventor_linking_en_5.5.1_3.0_1734306239926.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/lithuanian_patent_inventor_linking_en_5.5.1_3.0_1734306239926.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("lithuanian_patent_inventor_linking","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("lithuanian_patent_inventor_linking","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
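
A typical use of these sentence embeddings is scoring whether two inventor name strings refer to the same person. The cosine-similarity sketch below reuses the fitted `pipelineModel` from the Python example; the 0.9 pairing threshold is purely an illustrative assumption, not a published property of this model.

```python
import numpy as np

# Embed two candidate inventor records and compare them.
pairs = spark.createDataFrame([["John A. Smith, Cambridge"], ["Smith, John (Cambridge, GB)"]]).toDF("text")
rows = pipelineModel.transform(pairs).select("embeddings.embeddings").collect()

v1 = np.array(rows[0][0][0])
v2 = np.array(rows[1][0][0])
cosine = float(np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2)))
print(f"cosine similarity: {cosine:.3f}")  # e.g. treat > 0.9 as a likely match (assumed threshold)
```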
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|lithuanian_patent_inventor_linking| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.7 MB| + +## References + +https://huggingface.co/gbpatentdata/lt-patent-inventor-linking \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-lithuanian_patent_inventor_linking_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-lithuanian_patent_inventor_linking_pipeline_en.md new file mode 100644 index 00000000000000..90aa166cccc5c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-lithuanian_patent_inventor_linking_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English lithuanian_patent_inventor_linking_pipeline pipeline MPNetEmbeddings from gbpatentdata +author: John Snow Labs +name: lithuanian_patent_inventor_linking_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`lithuanian_patent_inventor_linking_pipeline` is a English model originally trained by gbpatentdata. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/lithuanian_patent_inventor_linking_pipeline_en_5.5.1_3.0_1734306263305.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/lithuanian_patent_inventor_linking_pipeline_en_5.5.1_3.0_1734306263305.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("lithuanian_patent_inventor_linking_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("lithuanian_patent_inventor_linking_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|lithuanian_patent_inventor_linking_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.7 MB| + +## References + +https://huggingface.co/gbpatentdata/lt-patent-inventor-linking + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mask_distilburt_finetuned_imdb_en.md b/docs/_posts/ahmedlone127/2024-12-15-mask_distilburt_finetuned_imdb_en.md new file mode 100644 index 00000000000000..ca716922f65ce0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mask_distilburt_finetuned_imdb_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English mask_distilburt_finetuned_imdb DistilBertEmbeddings from Faizyhugging +author: John Snow Labs +name: mask_distilburt_finetuned_imdb +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mask_distilburt_finetuned_imdb` is a English model originally trained by Faizyhugging. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mask_distilburt_finetuned_imdb_en_5.5.1_3.0_1734289916120.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mask_distilburt_finetuned_imdb_en_5.5.1_3.0_1734289916120.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("mask_distilburt_finetuned_imdb","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("mask_distilburt_finetuned_imdb","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mask_distilburt_finetuned_imdb| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Faizyhugging/Mask-distilburt-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mask_distilburt_finetuned_imdb_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-mask_distilburt_finetuned_imdb_pipeline_en.md new file mode 100644 index 00000000000000..c9c5687fd519f7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mask_distilburt_finetuned_imdb_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English mask_distilburt_finetuned_imdb_pipeline pipeline DistilBertEmbeddings from Faizyhugging +author: John Snow Labs +name: mask_distilburt_finetuned_imdb_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mask_distilburt_finetuned_imdb_pipeline` is a English model originally trained by Faizyhugging. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mask_distilburt_finetuned_imdb_pipeline_en_5.5.1_3.0_1734289931323.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mask_distilburt_finetuned_imdb_pipeline_en_5.5.1_3.0_1734289931323.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mask_distilburt_finetuned_imdb_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mask_distilburt_finetuned_imdb_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mask_distilburt_finetuned_imdb_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Faizyhugging/Mask-distilburt-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-meeting_summary_knkarthick_en.md b/docs/_posts/ahmedlone127/2024-12-15-meeting_summary_knkarthick_en.md new file mode 100644 index 00000000000000..c0d34a99df946f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-meeting_summary_knkarthick_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English meeting_summary_knkarthick BartTransformer from knkarthick +author: John Snow Labs +name: meeting_summary_knkarthick +date: 2024-12-15 +tags: [en, open_source, onnx, text_generation, bart] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BartTransformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BartTransformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`meeting_summary_knkarthick` is a English model originally trained by knkarthick. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/meeting_summary_knkarthick_en_5.5.1_3.0_1734304996322.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/meeting_summary_knkarthick_en_5.5.1_3.0_1734304996322.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# the transformer reads the "document" column produced by the assembler
seq2seq = BartTransformer.pretrained("meeting_summary_knkarthick","en") \
    .setInputCols(["document"]) \
    .setOutputCol("generation")

pipeline = Pipeline().setStages([documentAssembler, seq2seq])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

// the transformer reads the "document" column produced by the assembler
val seq2seq = BartTransformer.pretrained("meeting_summary_knkarthick","en")
    .setInputCols(Array("document"))
    .setOutputCol("generation")

val pipeline = new Pipeline().setStages(Array(documentAssembler, seq2seq))
val data = Seq("I love spark-nlp").toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
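
After running the example above, the generated summaries live in the `generation` annotation column. A minimal follow-up sketch (not part of the original card, assuming the pipeline was fit and applied exactly as shown):

```python
# Each annotation's "result" field holds the generated text.
pipelineDF.selectExpr("explode(generation.result) as summary").show(truncate=False)
```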
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|meeting_summary_knkarthick| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.9 GB| + +## References + +https://huggingface.co/knkarthick/MEETING_SUMMARY \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-meeting_summary_knkarthick_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-meeting_summary_knkarthick_pipeline_en.md new file mode 100644 index 00000000000000..0f5e025df72697 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-meeting_summary_knkarthick_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English meeting_summary_knkarthick_pipeline pipeline BartTransformer from knkarthick +author: John Snow Labs +name: meeting_summary_knkarthick_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BartTransformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`meeting_summary_knkarthick_pipeline` is a English model originally trained by knkarthick. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/meeting_summary_knkarthick_pipeline_en_5.5.1_3.0_1734305094994.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/meeting_summary_knkarthick_pipeline_en_5.5.1_3.0_1734305094994.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("meeting_summary_knkarthick_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("meeting_summary_knkarthick_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|meeting_summary_knkarthick_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.9 GB| + +## References + +https://huggingface.co/knkarthick/MEETING_SUMMARY + +## Included Models + +- DocumentAssembler +- BartTransformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-microsoft_codebert_base_finetuned_defect_cwe_group_detection_en.md b/docs/_posts/ahmedlone127/2024-12-15-microsoft_codebert_base_finetuned_defect_cwe_group_detection_en.md new file mode 100644 index 00000000000000..cd90a2c22fab93 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-microsoft_codebert_base_finetuned_defect_cwe_group_detection_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English microsoft_codebert_base_finetuned_defect_cwe_group_detection RoBertaForSequenceClassification from mcanoglu +author: John Snow Labs +name: microsoft_codebert_base_finetuned_defect_cwe_group_detection +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`microsoft_codebert_base_finetuned_defect_cwe_group_detection` is a English model originally trained by mcanoglu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/microsoft_codebert_base_finetuned_defect_cwe_group_detection_en_5.5.1_3.0_1734287574099.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/microsoft_codebert_base_finetuned_defect_cwe_group_detection_en_5.5.1_3.0_1734287574099.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

# the classifier consumes the assembler's "document" column together with the tokens
sequenceClassifier = RoBertaForSequenceClassification.pretrained("microsoft_codebert_base_finetuned_defect_cwe_group_detection","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

// the classifier consumes the assembler's "document" column together with the tokens
val sequenceClassifier = RoBertaForSequenceClassification.pretrained("microsoft_codebert_base_finetuned_defect_cwe_group_detection", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
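
The predicted label ends up in the `class` annotation column. As an optional quick check (a sketch, assuming the example above was run unchanged):

```python
# Show the input text next to the predicted class label(s).
pipelineDF.select("text", "class.result").show(truncate=False)
```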
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|microsoft_codebert_base_finetuned_defect_cwe_group_detection| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/mcanoglu/microsoft-codebert-base-finetuned-defect-cwe-group-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-microsoft_codebert_base_finetuned_defect_cwe_group_detection_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-microsoft_codebert_base_finetuned_defect_cwe_group_detection_pipeline_en.md new file mode 100644 index 00000000000000..201543644b628f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-microsoft_codebert_base_finetuned_defect_cwe_group_detection_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English microsoft_codebert_base_finetuned_defect_cwe_group_detection_pipeline pipeline RoBertaForSequenceClassification from mcanoglu +author: John Snow Labs +name: microsoft_codebert_base_finetuned_defect_cwe_group_detection_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`microsoft_codebert_base_finetuned_defect_cwe_group_detection_pipeline` is a English model originally trained by mcanoglu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/microsoft_codebert_base_finetuned_defect_cwe_group_detection_pipeline_en_5.5.1_3.0_1734287601312.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/microsoft_codebert_base_finetuned_defect_cwe_group_detection_pipeline_en_5.5.1_3.0_1734287601312.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("microsoft_codebert_base_finetuned_defect_cwe_group_detection_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("microsoft_codebert_base_finetuned_defect_cwe_group_detection_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|microsoft_codebert_base_finetuned_defect_cwe_group_detection_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|468.4 MB| + +## References + +https://huggingface.co/mcanoglu/microsoft-codebert-base-finetuned-defect-cwe-group-detection + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-minilmv2_l6_h768_from_bert_large_mrqa_en.md b/docs/_posts/ahmedlone127/2024-12-15-minilmv2_l6_h768_from_bert_large_mrqa_en.md new file mode 100644 index 00000000000000..37bb3fad4c54f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-minilmv2_l6_h768_from_bert_large_mrqa_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English minilmv2_l6_h768_from_bert_large_mrqa BertForQuestionAnswering from VMware +author: John Snow Labs +name: minilmv2_l6_h768_from_bert_large_mrqa +date: 2024-12-15 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`minilmv2_l6_h768_from_bert_large_mrqa` is a English model originally trained by VMware. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/minilmv2_l6_h768_from_bert_large_mrqa_en_5.5.1_3.0_1734297060459.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/minilmv2_l6_h768_from_bert_large_mrqa_en_5.5.1_3.0_1734297060459.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = MultiDocumentAssembler() \
    .setInputCols(["question", "context"]) \
    .setOutputCols(["document_question", "document_context"])

spanClassifier = BertForQuestionAnswering.pretrained("minilmv2_l6_h768_from_bert_large_mrqa","en") \
    .setInputCols(["document_question","document_context"]) \
    .setOutputCol("answer")

pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
# the input column names must match the assembler's input columns
data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new MultiDocumentAssembler()
    .setInputCols(Array("question", "context"))
    .setOutputCols(Array("document_question", "document_context"))

val spanClassifier = BertForQuestionAnswering.pretrained("minilmv2_l6_h768_from_bert_large_mrqa", "en")
    .setInputCols(Array("document_question","document_context"))
    .setOutputCol("answer")

val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
// the input column names must match the assembler's input columns
val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDF("question", "context")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
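
The extracted answer span is written to the `answer` column. A small follow-up sketch (assuming the example above ran as shown):

```python
# The "result" field of each answer annotation contains the extracted span.
pipelineDF.selectExpr("explode(answer.result) as answer").show(truncate=False)
```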
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|minilmv2_l6_h768_from_bert_large_mrqa| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.9 MB| + +## References + +https://huggingface.co/VMware/minilmv2-l6-h768-from-bert-large-mrqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-minilmv2_l6_h768_from_bert_large_mrqa_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-minilmv2_l6_h768_from_bert_large_mrqa_pipeline_en.md new file mode 100644 index 00000000000000..ff652ee030ec69 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-minilmv2_l6_h768_from_bert_large_mrqa_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English minilmv2_l6_h768_from_bert_large_mrqa_pipeline pipeline BertForQuestionAnswering from VMware +author: John Snow Labs +name: minilmv2_l6_h768_from_bert_large_mrqa_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`minilmv2_l6_h768_from_bert_large_mrqa_pipeline` is a English model originally trained by VMware. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/minilmv2_l6_h768_from_bert_large_mrqa_pipeline_en_5.5.1_3.0_1734297073322.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/minilmv2_l6_h768_from_bert_large_mrqa_pipeline_en_5.5.1_3.0_1734297073322.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("minilmv2_l6_h768_from_bert_large_mrqa_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("minilmv2_l6_h768_from_bert_large_mrqa_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|minilmv2_l6_h768_from_bert_large_mrqa_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.9 MB| + +## References + +https://huggingface.co/VMware/minilmv2-l6-h768-from-bert-large-mrqa + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mobilebert_uncased_squad_v1_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2024-12-15-mobilebert_uncased_squad_v1_finetuned_squad_en.md new file mode 100644 index 00000000000000..0664becd3e63aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mobilebert_uncased_squad_v1_finetuned_squad_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English mobilebert_uncased_squad_v1_finetuned_squad BertForQuestionAnswering from Hadjer +author: John Snow Labs +name: mobilebert_uncased_squad_v1_finetuned_squad +date: 2024-12-15 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mobilebert_uncased_squad_v1_finetuned_squad` is a English model originally trained by Hadjer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mobilebert_uncased_squad_v1_finetuned_squad_en_5.5.1_3.0_1734297459567.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mobilebert_uncased_squad_v1_finetuned_squad_en_5.5.1_3.0_1734297459567.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = MultiDocumentAssembler() \
    .setInputCols(["question", "context"]) \
    .setOutputCols(["document_question", "document_context"])

spanClassifier = BertForQuestionAnswering.pretrained("mobilebert_uncased_squad_v1_finetuned_squad","en") \
    .setInputCols(["document_question","document_context"]) \
    .setOutputCol("answer")

pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
# the input column names must match the assembler's input columns
data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new MultiDocumentAssembler()
    .setInputCols(Array("question", "context"))
    .setOutputCols(Array("document_question", "document_context"))

val spanClassifier = BertForQuestionAnswering.pretrained("mobilebert_uncased_squad_v1_finetuned_squad", "en")
    .setInputCols(Array("document_question","document_context"))
    .setOutputCol("answer")

val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
// the input column names must match the assembler's input columns
val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDF("question", "context")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mobilebert_uncased_squad_v1_finetuned_squad| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|92.5 MB| + +## References + +https://huggingface.co/Hadjer/mobilebert-uncased-squad-v1-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mobilebert_uncased_squad_v1_finetuned_squad_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-mobilebert_uncased_squad_v1_finetuned_squad_pipeline_en.md new file mode 100644 index 00000000000000..d9ac7f29ee5e97 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mobilebert_uncased_squad_v1_finetuned_squad_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mobilebert_uncased_squad_v1_finetuned_squad_pipeline pipeline BertForQuestionAnswering from Hadjer +author: John Snow Labs +name: mobilebert_uncased_squad_v1_finetuned_squad_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mobilebert_uncased_squad_v1_finetuned_squad_pipeline` is a English model originally trained by Hadjer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mobilebert_uncased_squad_v1_finetuned_squad_pipeline_en_5.5.1_3.0_1734297463887.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mobilebert_uncased_squad_v1_finetuned_squad_pipeline_en_5.5.1_3.0_1734297463887.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mobilebert_uncased_squad_v1_finetuned_squad_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mobilebert_uncased_squad_v1_finetuned_squad_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mobilebert_uncased_squad_v1_finetuned_squad_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|92.5 MB| + +## References + +https://huggingface.co/Hadjer/mobilebert-uncased-squad-v1-finetuned-squad + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-model2_en.md b/docs/_posts/ahmedlone127/2024-12-15-model2_en.md new file mode 100644 index 00000000000000..a3cc296d69f558 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-model2_en.md @@ -0,0 +1,88 @@ +--- +layout: model +title: English model2 DistilBertForQuestionAnswering from Vasu07 +author: John Snow Labs +name: model2 +date: 2024-12-15 +tags: [en, open_source, onnx, question_answering, distilbert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`model2` is a English model originally trained by Vasu07. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/model2_en_5.5.1_3.0_1734286805011.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/model2_en_5.5.1_3.0_1734286805011.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
documentAssembler = MultiDocumentAssembler() \
    .setInputCols(["question", "context"]) \
    .setOutputCols(["document_question", "document_context"])

spanClassifier = DistilBertForQuestionAnswering.pretrained("model2","en") \
    .setInputCols(["document_question","document_context"]) \
    .setOutputCol("answer")

pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
# the input column names must match the assembler's input columns
data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)
```
```scala
val documentAssembler = new MultiDocumentAssembler()
    .setInputCols(Array("question", "context"))
    .setOutputCols(Array("document_question", "document_context"))

val spanClassifier = DistilBertForQuestionAnswering.pretrained("model2", "en")
    .setInputCols(Array("document_question","document_context"))
    .setOutputCol("answer")

val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
// the input column names must match the assembler's input columns
val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDF("question", "context")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)
```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|model2| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.7 MB| + +## References + +References + +https://huggingface.co/Vasu07/model2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-model2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-model2_pipeline_en.md new file mode 100644 index 00000000000000..2123b610fc016b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-model2_pipeline_en.md @@ -0,0 +1,72 @@ +--- +layout: model +title: English model2_pipeline pipeline DistilBertForQuestionAnswering from Vasu07 +author: John Snow Labs +name: model2_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`model2_pipeline` is a English model originally trained by Vasu07. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/model2_pipeline_en_5.5.1_3.0_1734286821720.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/model2_pipeline_en_5.5.1_3.0_1734286821720.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +pipeline = PretrainedPipeline("model2_pipeline", lang = "en") +annotations = pipeline.transform(df) +``` +```scala +val pipeline = new PretrainedPipeline("model2_pipeline", lang = "en") +val annotations = pipeline.transform(df) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|model2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|308.8 MB| + +## References + +References + +https://huggingface.co/Vasu07/model2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_all_nli_triplet_bingcheng9_en.md b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_all_nli_triplet_bingcheng9_en.md new file mode 100644 index 00000000000000..bc3a927e290a89 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_all_nli_triplet_bingcheng9_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English mpnet_base_all_nli_triplet_bingcheng9 MPNetEmbeddings from bingcheng9 +author: John Snow Labs +name: mpnet_base_all_nli_triplet_bingcheng9 +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mpnet_base_all_nli_triplet_bingcheng9` is a English model originally trained by bingcheng9. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mpnet_base_all_nli_triplet_bingcheng9_en_5.5.1_3.0_1734306915052.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mpnet_base_all_nli_triplet_bingcheng9_en_5.5.1_3.0_1734306915052.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("mpnet_base_all_nli_triplet_bingcheng9","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("mpnet_base_all_nli_triplet_bingcheng9","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
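
The sentence vectors produced by the example above are stored in the `embeddings` field of each annotation. A minimal sketch of pulling them out (assuming the pipeline ran as shown):

```python
# One float array per annotation row; for sentence embeddings this is the sentence vector.
pipelineDF.selectExpr("explode(embeddings.embeddings) as vector").show(1, truncate=80)
```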
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mpnet_base_all_nli_triplet_bingcheng9| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|390.1 MB| + +## References + +https://huggingface.co/bingcheng9/mpnet-base-all-nli-triplet \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_all_nli_triplet_bingcheng9_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_all_nli_triplet_bingcheng9_pipeline_en.md new file mode 100644 index 00000000000000..7186d30a4a1962 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_all_nli_triplet_bingcheng9_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mpnet_base_all_nli_triplet_bingcheng9_pipeline pipeline MPNetEmbeddings from bingcheng9 +author: John Snow Labs +name: mpnet_base_all_nli_triplet_bingcheng9_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mpnet_base_all_nli_triplet_bingcheng9_pipeline` is a English model originally trained by bingcheng9. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mpnet_base_all_nli_triplet_bingcheng9_pipeline_en_5.5.1_3.0_1734306944927.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mpnet_base_all_nli_triplet_bingcheng9_pipeline_en_5.5.1_3.0_1734306944927.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mpnet_base_all_nli_triplet_bingcheng9_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mpnet_base_all_nli_triplet_bingcheng9_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mpnet_base_all_nli_triplet_bingcheng9_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|390.1 MB| + +## References + +https://huggingface.co/bingcheng9/mpnet-base-all-nli-triplet + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_all_nli_triplet_ivanleomk_en.md b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_all_nli_triplet_ivanleomk_en.md new file mode 100644 index 00000000000000..e0f1bb58ad850b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_all_nli_triplet_ivanleomk_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English mpnet_base_all_nli_triplet_ivanleomk MPNetEmbeddings from ivanleomk +author: John Snow Labs +name: mpnet_base_all_nli_triplet_ivanleomk +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mpnet_base_all_nli_triplet_ivanleomk` is a English model originally trained by ivanleomk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mpnet_base_all_nli_triplet_ivanleomk_en_5.5.1_3.0_1734307014293.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mpnet_base_all_nli_triplet_ivanleomk_en_5.5.1_3.0_1734307014293.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("mpnet_base_all_nli_triplet_ivanleomk","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("mpnet_base_all_nli_triplet_ivanleomk","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
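
For quick, DataFrame-free checks, the fitted pipeline from the example above can also be wrapped in a `LightPipeline`. A sketch (optional, not part of the original card):

```python
from sparknlp.base import LightPipeline

# Runs the fitted pipeline directly on plain Python strings.
light = LightPipeline(pipelineModel)
result = light.fullAnnotate("I love spark-nlp")
```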
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mpnet_base_all_nli_triplet_ivanleomk| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|376.3 MB| + +## References + +https://huggingface.co/ivanleomk/mpnet-base-all-nli-triplet \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_all_nli_triplet_ivanleomk_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_all_nli_triplet_ivanleomk_pipeline_en.md new file mode 100644 index 00000000000000..e0e0019ad996d9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_all_nli_triplet_ivanleomk_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mpnet_base_all_nli_triplet_ivanleomk_pipeline pipeline MPNetEmbeddings from ivanleomk +author: John Snow Labs +name: mpnet_base_all_nli_triplet_ivanleomk_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mpnet_base_all_nli_triplet_ivanleomk_pipeline` is a English model originally trained by ivanleomk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mpnet_base_all_nli_triplet_ivanleomk_pipeline_en_5.5.1_3.0_1734307045885.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mpnet_base_all_nli_triplet_ivanleomk_pipeline_en_5.5.1_3.0_1734307045885.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mpnet_base_all_nli_triplet_ivanleomk_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mpnet_base_all_nli_triplet_ivanleomk_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mpnet_base_all_nli_triplet_ivanleomk_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|376.3 MB| + +## References + +https://huggingface.co/ivanleomk/mpnet-base-all-nli-triplet + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_all_obliqa_nmr_3_en.md b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_all_obliqa_nmr_3_en.md new file mode 100644 index 00000000000000..b8a971166e5f5a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_all_obliqa_nmr_3_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English mpnet_base_all_obliqa_nmr_3 MPNetEmbeddings from jebish7 +author: John Snow Labs +name: mpnet_base_all_obliqa_nmr_3 +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mpnet_base_all_obliqa_nmr_3` is a English model originally trained by jebish7. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mpnet_base_all_obliqa_nmr_3_en_5.5.1_3.0_1734305866324.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mpnet_base_all_obliqa_nmr_3_en_5.5.1_3.0_1734305866324.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("mpnet_base_all_obliqa_nmr_3","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("mpnet_base_all_obliqa_nmr_3","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mpnet_base_all_obliqa_nmr_3| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/jebish7/mpnet-base-all-obliqa_NMR_3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_all_obliqa_nmr_3_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_all_obliqa_nmr_3_pipeline_en.md new file mode 100644 index 00000000000000..e29f2820e7ae28 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_all_obliqa_nmr_3_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mpnet_base_all_obliqa_nmr_3_pipeline pipeline MPNetEmbeddings from jebish7 +author: John Snow Labs +name: mpnet_base_all_obliqa_nmr_3_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mpnet_base_all_obliqa_nmr_3_pipeline` is a English model originally trained by jebish7. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mpnet_base_all_obliqa_nmr_3_pipeline_en_5.5.1_3.0_1734305892289.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mpnet_base_all_obliqa_nmr_3_pipeline_en_5.5.1_3.0_1734305892289.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mpnet_base_all_obliqa_nmr_3_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mpnet_base_all_obliqa_nmr_3_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mpnet_base_all_obliqa_nmr_3_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/jebish7/mpnet-base-all-obliqa_NMR_3 + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_gooaq_cmnrl_mrl_en.md b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_gooaq_cmnrl_mrl_en.md new file mode 100644 index 00000000000000..791648d7a9baa8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_gooaq_cmnrl_mrl_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English mpnet_base_gooaq_cmnrl_mrl MPNetEmbeddings from tomaarsen +author: John Snow Labs +name: mpnet_base_gooaq_cmnrl_mrl +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mpnet_base_gooaq_cmnrl_mrl` is a English model originally trained by tomaarsen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mpnet_base_gooaq_cmnrl_mrl_en_5.5.1_3.0_1734305867594.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mpnet_base_gooaq_cmnrl_mrl_en_5.5.1_3.0_1734305867594.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("mpnet_base_gooaq_cmnrl_mrl","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("mpnet_base_gooaq_cmnrl_mrl","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
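
If the vectors are needed as plain Spark ML vectors for downstream stages, an `EmbeddingsFinisher` can be applied to the output of the example above. A sketch under that assumption (the output column name is illustrative):

```python
from sparknlp.base import EmbeddingsFinisher

# Converts the MPNet annotation column into Spark ML vectors.
finisher = EmbeddingsFinisher() \
    .setInputCols(["embeddings"]) \
    .setOutputCols(["finished_embeddings"]) \
    .setOutputAsVector(True)

finished = finisher.transform(pipelineDF)
finished.select("finished_embeddings").show(1, truncate=80)
```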
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mpnet_base_gooaq_cmnrl_mrl| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.7 MB| + +## References + +https://huggingface.co/tomaarsen/mpnet-base-gooaq-cmnrl-mrl \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_gooaq_cmnrl_mrl_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_gooaq_cmnrl_mrl_pipeline_en.md new file mode 100644 index 00000000000000..3ace60ffd1a788 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_gooaq_cmnrl_mrl_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mpnet_base_gooaq_cmnrl_mrl_pipeline pipeline MPNetEmbeddings from tomaarsen +author: John Snow Labs +name: mpnet_base_gooaq_cmnrl_mrl_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mpnet_base_gooaq_cmnrl_mrl_pipeline` is a English model originally trained by tomaarsen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mpnet_base_gooaq_cmnrl_mrl_pipeline_en_5.5.1_3.0_1734305892455.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mpnet_base_gooaq_cmnrl_mrl_pipeline_en_5.5.1_3.0_1734305892455.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mpnet_base_gooaq_cmnrl_mrl_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mpnet_base_gooaq_cmnrl_mrl_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mpnet_base_gooaq_cmnrl_mrl_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.7 MB| + +## References + +https://huggingface.co/tomaarsen/mpnet-base-gooaq-cmnrl-mrl + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_nq_cgist_triplet_3_gte_en.md b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_nq_cgist_triplet_3_gte_en.md new file mode 100644 index 00000000000000..6a924d79d9a478 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_nq_cgist_triplet_3_gte_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English mpnet_base_nq_cgist_triplet_3_gte MPNetEmbeddings from tomaarsen +author: John Snow Labs +name: mpnet_base_nq_cgist_triplet_3_gte +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mpnet_base_nq_cgist_triplet_3_gte` is a English model originally trained by tomaarsen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mpnet_base_nq_cgist_triplet_3_gte_en_5.5.1_3.0_1734306350961.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mpnet_base_nq_cgist_triplet_3_gte_en_5.5.1_3.0_1734306350961.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("mpnet_base_nq_cgist_triplet_3_gte","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("mpnet_base_nq_cgist_triplet_3_gte","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mpnet_base_nq_cgist_triplet_3_gte| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|401.8 MB| + +## References + +https://huggingface.co/tomaarsen/mpnet-base-nq-cgist-triplet-3-gte \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_nq_cgist_triplet_3_gte_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_nq_cgist_triplet_3_gte_pipeline_en.md new file mode 100644 index 00000000000000..b8815bee5f0939 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_nq_cgist_triplet_3_gte_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mpnet_base_nq_cgist_triplet_3_gte_pipeline pipeline MPNetEmbeddings from tomaarsen +author: John Snow Labs +name: mpnet_base_nq_cgist_triplet_3_gte_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mpnet_base_nq_cgist_triplet_3_gte_pipeline` is a English model originally trained by tomaarsen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mpnet_base_nq_cgist_triplet_3_gte_pipeline_en_5.5.1_3.0_1734306373626.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mpnet_base_nq_cgist_triplet_3_gte_pipeline_en_5.5.1_3.0_1734306373626.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mpnet_base_nq_cgist_triplet_3_gte_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mpnet_base_nq_cgist_triplet_3_gte_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mpnet_base_nq_cgist_triplet_3_gte_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|401.8 MB| + +## References + +https://huggingface.co/tomaarsen/mpnet-base-nq-cgist-triplet-3-gte + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_nq_cgist_triplet_gt_en.md b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_nq_cgist_triplet_gt_en.md new file mode 100644 index 00000000000000..99fa01ce6ac266 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_nq_cgist_triplet_gt_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English mpnet_base_nq_cgist_triplet_gt MPNetEmbeddings from tomaarsen +author: John Snow Labs +name: mpnet_base_nq_cgist_triplet_gt +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mpnet_base_nq_cgist_triplet_gt` is a English model originally trained by tomaarsen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mpnet_base_nq_cgist_triplet_gt_en_5.5.1_3.0_1734306623387.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mpnet_base_nq_cgist_triplet_gt_en_5.5.1_3.0_1734306623387.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("mpnet_base_nq_cgist_triplet_gt","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("mpnet_base_nq_cgist_triplet_gt","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
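
The fitted pipeline from the example above can be persisted with the standard Spark ML writer and reloaded later. A minimal sketch (the path is a placeholder):

```python
from pyspark.ml import PipelineModel

# Save the fitted pipeline to disk and load it back.
pipelineModel.write().overwrite().save("/tmp/mpnet_base_nq_cgist_triplet_gt_spark_nlp")
reloaded = PipelineModel.load("/tmp/mpnet_base_nq_cgist_triplet_gt_spark_nlp")
```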
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mpnet_base_nq_cgist_triplet_gt| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|401.8 MB| + +## References + +https://huggingface.co/tomaarsen/mpnet-base-nq-cgist-triplet-gt \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_nq_cgist_triplet_gt_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_nq_cgist_triplet_gt_pipeline_en.md new file mode 100644 index 00000000000000..d1bd613f5489b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_nq_cgist_triplet_gt_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mpnet_base_nq_cgist_triplet_gt_pipeline pipeline MPNetEmbeddings from tomaarsen +author: John Snow Labs +name: mpnet_base_nq_cgist_triplet_gt_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mpnet_base_nq_cgist_triplet_gt_pipeline` is a English model originally trained by tomaarsen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mpnet_base_nq_cgist_triplet_gt_pipeline_en_5.5.1_3.0_1734306646591.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mpnet_base_nq_cgist_triplet_gt_pipeline_en_5.5.1_3.0_1734306646591.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mpnet_base_nq_cgist_triplet_gt_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mpnet_base_nq_cgist_triplet_gt_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
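+
+The pretrained pipeline example above applies `pipeline.transform(df)` to a DataFrame `df` that is not defined in the snippet. A minimal, illustrative way to construct it and run the pipeline, assuming an active Spark NLP session and an input column named "text":
+
+```python
+# Illustrative usage sketch for the pretrained pipeline shown above.
+import sparknlp
+from sparknlp.pretrained import PretrainedPipeline
+
+spark = sparknlp.start()
+df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+
+pipeline = PretrainedPipeline("mpnet_base_nq_cgist_triplet_gt_pipeline", lang="en")
+annotations = pipeline.transform(df)
+
+# Inspect the schema to see which annotation columns the pipeline produced.
+annotations.printSchema()
+```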
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mpnet_base_nq_cgist_triplet_gt_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|401.8 MB| + +## References + +https://huggingface.co/tomaarsen/mpnet-base-nq-cgist-triplet-gt + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_nq_cgist_triplet_neg_gte_en.md b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_nq_cgist_triplet_neg_gte_en.md new file mode 100644 index 00000000000000..435e1161dde181 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_nq_cgist_triplet_neg_gte_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English mpnet_base_nq_cgist_triplet_neg_gte MPNetEmbeddings from tomaarsen +author: John Snow Labs +name: mpnet_base_nq_cgist_triplet_neg_gte +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mpnet_base_nq_cgist_triplet_neg_gte` is a English model originally trained by tomaarsen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mpnet_base_nq_cgist_triplet_neg_gte_en_5.5.1_3.0_1734305700898.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mpnet_base_nq_cgist_triplet_neg_gte_en_5.5.1_3.0_1734305700898.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("mpnet_base_nq_cgist_triplet_neg_gte","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("mpnet_base_nq_cgist_triplet_neg_gte","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mpnet_base_nq_cgist_triplet_neg_gte| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|401.8 MB| + +## References + +https://huggingface.co/tomaarsen/mpnet-base-nq-cgist-triplet-neg-gte \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_nq_cgist_triplet_neg_gte_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_nq_cgist_triplet_neg_gte_pipeline_en.md new file mode 100644 index 00000000000000..0d3e2860da02ee --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mpnet_base_nq_cgist_triplet_neg_gte_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mpnet_base_nq_cgist_triplet_neg_gte_pipeline pipeline MPNetEmbeddings from tomaarsen +author: John Snow Labs +name: mpnet_base_nq_cgist_triplet_neg_gte_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mpnet_base_nq_cgist_triplet_neg_gte_pipeline` is a English model originally trained by tomaarsen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mpnet_base_nq_cgist_triplet_neg_gte_pipeline_en_5.5.1_3.0_1734305730041.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mpnet_base_nq_cgist_triplet_neg_gte_pipeline_en_5.5.1_3.0_1734305730041.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mpnet_base_nq_cgist_triplet_neg_gte_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mpnet_base_nq_cgist_triplet_neg_gte_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mpnet_base_nq_cgist_triplet_neg_gte_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|401.8 MB| + +## References + +https://huggingface.co/tomaarsen/mpnet-base-nq-cgist-triplet-neg-gte + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mpnet_finetuned_recursive_en.md b/docs/_posts/ahmedlone127/2024-12-15-mpnet_finetuned_recursive_en.md new file mode 100644 index 00000000000000..c1db2d12baf529 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mpnet_finetuned_recursive_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English mpnet_finetuned_recursive MPNetEmbeddings from jet-taekyo +author: John Snow Labs +name: mpnet_finetuned_recursive +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mpnet_finetuned_recursive` is a English model originally trained by jet-taekyo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mpnet_finetuned_recursive_en_5.5.1_3.0_1734306019757.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mpnet_finetuned_recursive_en_5.5.1_3.0_1734306019757.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("mpnet_finetuned_recursive","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("mpnet_finetuned_recursive","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mpnet_finetuned_recursive| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/jet-taekyo/mpnet_finetuned_recursive \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mpnet_finetuned_recursive_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-mpnet_finetuned_recursive_pipeline_en.md new file mode 100644 index 00000000000000..e444e47324eff9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mpnet_finetuned_recursive_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mpnet_finetuned_recursive_pipeline pipeline MPNetEmbeddings from jet-taekyo +author: John Snow Labs +name: mpnet_finetuned_recursive_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mpnet_finetuned_recursive_pipeline` is a English model originally trained by jet-taekyo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mpnet_finetuned_recursive_pipeline_en_5.5.1_3.0_1734306040414.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mpnet_finetuned_recursive_pipeline_en_5.5.1_3.0_1734306040414.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mpnet_finetuned_recursive_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mpnet_finetuned_recursive_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mpnet_finetuned_recursive_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/jet-taekyo/mpnet_finetuned_recursive + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mpnet_finetuned_semantic_en.md b/docs/_posts/ahmedlone127/2024-12-15-mpnet_finetuned_semantic_en.md new file mode 100644 index 00000000000000..58d2a7f9afdd88 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mpnet_finetuned_semantic_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English mpnet_finetuned_semantic MPNetEmbeddings from jet-taekyo +author: John Snow Labs +name: mpnet_finetuned_semantic +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mpnet_finetuned_semantic` is a English model originally trained by jet-taekyo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mpnet_finetuned_semantic_en_5.5.1_3.0_1734306519112.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mpnet_finetuned_semantic_en_5.5.1_3.0_1734306519112.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("mpnet_finetuned_semantic","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("mpnet_finetuned_semantic","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mpnet_finetuned_semantic| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|407.0 MB| + +## References + +https://huggingface.co/jet-taekyo/mpnet_finetuned_semantic \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mpnet_finetuned_semantic_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-mpnet_finetuned_semantic_pipeline_en.md new file mode 100644 index 00000000000000..5f02d77499eda3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mpnet_finetuned_semantic_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mpnet_finetuned_semantic_pipeline pipeline MPNetEmbeddings from jet-taekyo +author: John Snow Labs +name: mpnet_finetuned_semantic_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mpnet_finetuned_semantic_pipeline` is a English model originally trained by jet-taekyo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mpnet_finetuned_semantic_pipeline_en_5.5.1_3.0_1734306543049.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mpnet_finetuned_semantic_pipeline_en_5.5.1_3.0_1734306543049.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mpnet_finetuned_semantic_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mpnet_finetuned_semantic_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mpnet_finetuned_semantic_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/jet-taekyo/mpnet_finetuned_semantic + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mt5_base_anaphora_czech_6e_en.md b/docs/_posts/ahmedlone127/2024-12-15-mt5_base_anaphora_czech_6e_en.md new file mode 100644 index 00000000000000..0ac996392ba6e6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mt5_base_anaphora_czech_6e_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English mt5_base_anaphora_czech_6e T5Transformer from patrixtano +author: John Snow Labs +name: mt5_base_anaphora_czech_6e +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_base_anaphora_czech_6e` is a English model originally trained by patrixtano. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_base_anaphora_czech_6e_en_5.5.1_3.0_1734300607784.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_base_anaphora_czech_6e_en_5.5.1_3.0_1734300607784.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("mt5_base_anaphora_czech_6e","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("mt5_base_anaphora_czech_6e", "en")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
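+
+As with the embedding examples earlier, the imports and Spark session are assumed rather than shown. A brief, illustrative Python sketch for the T5 example, including one way to read the generated text from the "output" column configured above:
+
+```python
+# Illustrative setup sketch (not part of the generated example above).
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import T5Transformer
+from pyspark.ml import Pipeline
+
+spark = sparknlp.start()
+
+# After running the example, the generated text is stored as annotation results:
+# pipelineDF.selectExpr("explode(output.result) AS generated_text").show(truncate=False)
+```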
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_base_anaphora_czech_6e| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|2.3 GB| + +## References + +https://huggingface.co/patrixtano/mt5-base-anaphora_czech_6e \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mt5_base_anaphora_czech_6e_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-mt5_base_anaphora_czech_6e_pipeline_en.md new file mode 100644 index 00000000000000..1eaebf9f048ee3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mt5_base_anaphora_czech_6e_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mt5_base_anaphora_czech_6e_pipeline pipeline T5Transformer from patrixtano +author: John Snow Labs +name: mt5_base_anaphora_czech_6e_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_base_anaphora_czech_6e_pipeline` is a English model originally trained by patrixtano. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_base_anaphora_czech_6e_pipeline_en_5.5.1_3.0_1734300835682.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_base_anaphora_czech_6e_pipeline_en_5.5.1_3.0_1734300835682.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mt5_base_anaphora_czech_6e_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mt5_base_anaphora_czech_6e_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_base_anaphora_czech_6e_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|2.3 GB| + +## References + +https://huggingface.co/patrixtano/mt5-base-anaphora_czech_6e + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mt5_base_english_spider_en.md b/docs/_posts/ahmedlone127/2024-12-15-mt5_base_english_spider_en.md new file mode 100644 index 00000000000000..d1ae0d82a8c117 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mt5_base_english_spider_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English mt5_base_english_spider T5Transformer from e22vvb +author: John Snow Labs +name: mt5_base_english_spider +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_base_english_spider` is a English model originally trained by e22vvb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_base_english_spider_en_5.5.1_3.0_1734300173911.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_base_english_spider_en_5.5.1_3.0_1734300173911.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("mt5_base_english_spider","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("mt5_base_english_spider", "en")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_base_english_spider| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.5 GB| + +## References + +https://huggingface.co/e22vvb/mt5-base_EN_spider \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mt5_base_english_spider_norwegian_decode_en.md b/docs/_posts/ahmedlone127/2024-12-15-mt5_base_english_spider_norwegian_decode_en.md new file mode 100644 index 00000000000000..1ebbe8da729432 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mt5_base_english_spider_norwegian_decode_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English mt5_base_english_spider_norwegian_decode T5Transformer from e22vvb +author: John Snow Labs +name: mt5_base_english_spider_norwegian_decode +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_base_english_spider_norwegian_decode` is a English model originally trained by e22vvb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_base_english_spider_norwegian_decode_en_5.5.1_3.0_1734300583756.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_base_english_spider_norwegian_decode_en_5.5.1_3.0_1734300583756.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("mt5_base_english_spider_norwegian_decode","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("mt5_base_english_spider_norwegian_decode", "en")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_base_english_spider_norwegian_decode| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.5 GB| + +## References + +https://huggingface.co/e22vvb/mt5-base_EN_spider_no_decode \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mt5_base_english_spider_norwegian_decode_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-mt5_base_english_spider_norwegian_decode_pipeline_en.md new file mode 100644 index 00000000000000..dd92be8169c7c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mt5_base_english_spider_norwegian_decode_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mt5_base_english_spider_norwegian_decode_pipeline pipeline T5Transformer from e22vvb +author: John Snow Labs +name: mt5_base_english_spider_norwegian_decode_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_base_english_spider_norwegian_decode_pipeline` is a English model originally trained by e22vvb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_base_english_spider_norwegian_decode_pipeline_en_5.5.1_3.0_1734301064430.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_base_english_spider_norwegian_decode_pipeline_en_5.5.1_3.0_1734301064430.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mt5_base_english_spider_norwegian_decode_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mt5_base_english_spider_norwegian_decode_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_base_english_spider_norwegian_decode_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.5 GB| + +## References + +https://huggingface.co/e22vvb/mt5-base_EN_spider_no_decode + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mt5_base_english_spider_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-mt5_base_english_spider_pipeline_en.md new file mode 100644 index 00000000000000..1e39f839d4d752 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mt5_base_english_spider_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mt5_base_english_spider_pipeline pipeline T5Transformer from e22vvb +author: John Snow Labs +name: mt5_base_english_spider_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_base_english_spider_pipeline` is a English model originally trained by e22vvb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_base_english_spider_pipeline_en_5.5.1_3.0_1734300650482.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_base_english_spider_pipeline_en_5.5.1_3.0_1734300650482.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mt5_base_english_spider_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mt5_base_english_spider_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_base_english_spider_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.5 GB| + +## References + +https://huggingface.co/e22vvb/mt5-base_EN_spider + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mt5_base_kirakira_names_en.md b/docs/_posts/ahmedlone127/2024-12-15-mt5_base_kirakira_names_en.md new file mode 100644 index 00000000000000..64b1ab88eb92b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mt5_base_kirakira_names_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English mt5_base_kirakira_names T5Transformer from umisato +author: John Snow Labs +name: mt5_base_kirakira_names +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_base_kirakira_names` is a English model originally trained by umisato. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_base_kirakira_names_en_5.5.1_3.0_1734301014915.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_base_kirakira_names_en_5.5.1_3.0_1734301014915.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("mt5_base_kirakira_names","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("mt5_base_kirakira_names", "en")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_base_kirakira_names| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|2.2 GB| + +## References + +https://huggingface.co/umisato/mt5-base-kirakira-names \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mt5_base_kirakira_names_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-mt5_base_kirakira_names_pipeline_en.md new file mode 100644 index 00000000000000..4d89a0c5f3b631 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mt5_base_kirakira_names_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mt5_base_kirakira_names_pipeline pipeline T5Transformer from umisato +author: John Snow Labs +name: mt5_base_kirakira_names_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_base_kirakira_names_pipeline` is a English model originally trained by umisato. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_base_kirakira_names_pipeline_en_5.5.1_3.0_1734301181140.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_base_kirakira_names_pipeline_en_5.5.1_3.0_1734301181140.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mt5_base_kirakira_names_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mt5_base_kirakira_names_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_base_kirakira_names_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|2.2 GB| + +## References + +https://huggingface.co/umisato/mt5-base-kirakira-names + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mt5_small_ainu_en.md b/docs/_posts/ahmedlone127/2024-12-15-mt5_small_ainu_en.md new file mode 100644 index 00000000000000..4f027b3ae9c7c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mt5_small_ainu_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English mt5_small_ainu T5Transformer from aynumosir +author: John Snow Labs +name: mt5_small_ainu +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_small_ainu` is a English model originally trained by aynumosir. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_small_ainu_en_5.5.1_3.0_1734302309092.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_small_ainu_en_5.5.1_3.0_1734302309092.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("mt5_small_ainu","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("mt5_small_ainu", "en")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_small_ainu| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/aynumosir/mt5-small-ainu \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mt5_small_ainu_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-mt5_small_ainu_pipeline_en.md new file mode 100644 index 00000000000000..094acc61752c9c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mt5_small_ainu_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mt5_small_ainu_pipeline pipeline T5Transformer from aynumosir +author: John Snow Labs +name: mt5_small_ainu_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_small_ainu_pipeline` is a English model originally trained by aynumosir. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_small_ainu_pipeline_en_5.5.1_3.0_1734302401528.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_small_ainu_pipeline_en_5.5.1_3.0_1734302401528.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mt5_small_ainu_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mt5_small_ainu_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_small_ainu_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/aynumosir/mt5-small-ainu + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mt5_small_anaphora_czech_6e_en.md b/docs/_posts/ahmedlone127/2024-12-15-mt5_small_anaphora_czech_6e_en.md new file mode 100644 index 00000000000000..704b8e88730aa9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mt5_small_anaphora_czech_6e_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English mt5_small_anaphora_czech_6e T5Transformer from patrixtano +author: John Snow Labs +name: mt5_small_anaphora_czech_6e +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_small_anaphora_czech_6e` is a English model originally trained by patrixtano. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_small_anaphora_czech_6e_en_5.5.1_3.0_1734301607961.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_small_anaphora_czech_6e_en_5.5.1_3.0_1734301607961.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("mt5_small_anaphora_czech_6e","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("mt5_small_anaphora_czech_6e", "en")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_small_anaphora_czech_6e| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/patrixtano/mt5-small-anaphora_czech_6e \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mt5_small_anaphora_czech_6e_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-mt5_small_anaphora_czech_6e_pipeline_en.md new file mode 100644 index 00000000000000..e14ddd1367570f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mt5_small_anaphora_czech_6e_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mt5_small_anaphora_czech_6e_pipeline pipeline T5Transformer from patrixtano +author: John Snow Labs +name: mt5_small_anaphora_czech_6e_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_small_anaphora_czech_6e_pipeline` is a English model originally trained by patrixtano. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_small_anaphora_czech_6e_pipeline_en_5.5.1_3.0_1734301743622.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_small_anaphora_czech_6e_pipeline_en_5.5.1_3.0_1734301743622.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mt5_small_anaphora_czech_6e_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mt5_small_anaphora_czech_6e_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_small_anaphora_czech_6e_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/patrixtano/mt5-small-anaphora_czech_6e + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mt5_small_hsuuhsuu_en.md b/docs/_posts/ahmedlone127/2024-12-15-mt5_small_hsuuhsuu_en.md new file mode 100644 index 00000000000000..aad7ae20611cf3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mt5_small_hsuuhsuu_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English mt5_small_hsuuhsuu T5Transformer from HsuuHsuu +author: John Snow Labs +name: mt5_small_hsuuhsuu +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_small_hsuuhsuu` is a English model originally trained by HsuuHsuu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_small_hsuuhsuu_en_5.5.1_3.0_1734299307077.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_small_hsuuhsuu_en_5.5.1_3.0_1734299307077.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("mt5_small_hsuuhsuu","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("mt5_small_hsuuhsuu", "en")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_small_hsuuhsuu| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/HsuuHsuu/mt5-small \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-mt5_small_hsuuhsuu_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-mt5_small_hsuuhsuu_pipeline_en.md new file mode 100644 index 00000000000000..409dad704d2120 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-mt5_small_hsuuhsuu_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mt5_small_hsuuhsuu_pipeline pipeline T5Transformer from HsuuHsuu +author: John Snow Labs +name: mt5_small_hsuuhsuu_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_small_hsuuhsuu_pipeline` is a English model originally trained by HsuuHsuu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_small_hsuuhsuu_pipeline_en_5.5.1_3.0_1734299443386.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_small_hsuuhsuu_pipeline_en_5.5.1_3.0_1734299443386.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mt5_small_hsuuhsuu_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mt5_small_hsuuhsuu_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_small_hsuuhsuu_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/HsuuHsuu/mt5-small + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-multi_qa_mpnet_base_dot_v1_4_frozen_en.md b/docs/_posts/ahmedlone127/2024-12-15-multi_qa_mpnet_base_dot_v1_4_frozen_en.md new file mode 100644 index 00000000000000..0f0660d9b128c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-multi_qa_mpnet_base_dot_v1_4_frozen_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English multi_qa_mpnet_base_dot_v1_4_frozen MPNetEmbeddings from yashmalviya +author: John Snow Labs +name: multi_qa_mpnet_base_dot_v1_4_frozen +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`multi_qa_mpnet_base_dot_v1_4_frozen` is a English model originally trained by yashmalviya. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/multi_qa_mpnet_base_dot_v1_4_frozen_en_5.5.1_3.0_1734305699333.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/multi_qa_mpnet_base_dot_v1_4_frozen_en_5.5.1_3.0_1734305699333.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("multi_qa_mpnet_base_dot_v1_4_frozen","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("multi_qa_mpnet_base_dot_v1_4_frozen","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|multi_qa_mpnet_base_dot_v1_4_frozen| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.8 MB| + +## References + +https://huggingface.co/yashmalviya/multi-qa-mpnet-base-dot-v1-4-frozen \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-multi_qa_mpnet_base_dot_v1_4_frozen_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-multi_qa_mpnet_base_dot_v1_4_frozen_pipeline_en.md new file mode 100644 index 00000000000000..dc3fad4bd1d0ae --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-multi_qa_mpnet_base_dot_v1_4_frozen_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English multi_qa_mpnet_base_dot_v1_4_frozen_pipeline pipeline MPNetEmbeddings from yashmalviya +author: John Snow Labs +name: multi_qa_mpnet_base_dot_v1_4_frozen_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`multi_qa_mpnet_base_dot_v1_4_frozen_pipeline` is a English model originally trained by yashmalviya. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/multi_qa_mpnet_base_dot_v1_4_frozen_pipeline_en_5.5.1_3.0_1734305723173.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/multi_qa_mpnet_base_dot_v1_4_frozen_pipeline_en_5.5.1_3.0_1734305723173.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("multi_qa_mpnet_base_dot_v1_4_frozen_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("multi_qa_mpnet_base_dot_v1_4_frozen_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|multi_qa_mpnet_base_dot_v1_4_frozen_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.8 MB| + +## References + +https://huggingface.co/yashmalviya/multi-qa-mpnet-base-dot-v1-4-frozen + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-multi_sbert_en.md b/docs/_posts/ahmedlone127/2024-12-15-multi_sbert_en.md new file mode 100644 index 00000000000000..0a9c44efdb7e22 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-multi_sbert_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English multi_sbert MPNetEmbeddings from Gnartiel +author: John Snow Labs +name: multi_sbert +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`multi_sbert` is a English model originally trained by Gnartiel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/multi_sbert_en_5.5.1_3.0_1734306543708.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/multi_sbert_en_5.5.1_3.0_1734306543708.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("multi_sbert","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("multi_sbert","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|multi_sbert| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.8 MB| + +## References + +https://huggingface.co/Gnartiel/multi-sbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-multi_sbert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-multi_sbert_pipeline_en.md new file mode 100644 index 00000000000000..9f525fb2cbb5ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-multi_sbert_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English multi_sbert_pipeline pipeline MPNetEmbeddings from Gnartiel +author: John Snow Labs +name: multi_sbert_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`multi_sbert_pipeline` is a English model originally trained by Gnartiel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/multi_sbert_pipeline_en_5.5.1_3.0_1734306569071.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/multi_sbert_pipeline_en_5.5.1_3.0_1734306569071.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("multi_sbert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("multi_sbert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|multi_sbert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.8 MB| + +## References + +https://huggingface.co/Gnartiel/multi-sbert + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-muril_squad_nep_translated_squad_en.md b/docs/_posts/ahmedlone127/2024-12-15-muril_squad_nep_translated_squad_en.md new file mode 100644 index 00000000000000..7350d310a01d35 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-muril_squad_nep_translated_squad_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English muril_squad_nep_translated_squad BertForQuestionAnswering from suban244 +author: John Snow Labs +name: muril_squad_nep_translated_squad +date: 2024-12-15 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`muril_squad_nep_translated_squad` is a English model originally trained by suban244. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/muril_squad_nep_translated_squad_en_5.5.1_3.0_1734296941556.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/muril_squad_nep_translated_squad_en_5.5.1_3.0_1734296941556.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("muril_squad_nep_translated_squad","en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
+data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("muril_squad_nep_translated_squad", "en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDS.toDF("question", "context")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
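+
+To inspect the prediction from the example above, the `answer` column can be unpacked; this is an illustrative sketch rather than part of the original card:
+
+```python
+# Each row carries the predicted answer span in answer.result.
+pipelineDF.selectExpr("explode(answer.result) AS predicted_answer").show(truncate=False)
+```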
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|muril_squad_nep_translated_squad| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|890.4 MB| + +## References + +https://huggingface.co/suban244/muRIL-squad-nep-translated-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-muril_squad_nep_translated_squad_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-muril_squad_nep_translated_squad_pipeline_en.md new file mode 100644 index 00000000000000..fb2174e455652b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-muril_squad_nep_translated_squad_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English muril_squad_nep_translated_squad_pipeline pipeline BertForQuestionAnswering from suban244 +author: John Snow Labs +name: muril_squad_nep_translated_squad_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`muril_squad_nep_translated_squad_pipeline` is a English model originally trained by suban244. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/muril_squad_nep_translated_squad_pipeline_en_5.5.1_3.0_1734296986208.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/muril_squad_nep_translated_squad_pipeline_en_5.5.1_3.0_1734296986208.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("muril_squad_nep_translated_squad_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("muril_squad_nep_translated_squad_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|muril_squad_nep_translated_squad_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|890.4 MB| + +## References + +https://huggingface.co/suban244/muRIL-squad-nep-translated-squad + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-nli_roberta_base_finetuned_for_amazon_review_ratings_en.md b/docs/_posts/ahmedlone127/2024-12-15-nli_roberta_base_finetuned_for_amazon_review_ratings_en.md new file mode 100644 index 00000000000000..8850dcab4efca0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-nli_roberta_base_finetuned_for_amazon_review_ratings_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English nli_roberta_base_finetuned_for_amazon_review_ratings RoBertaForSequenceClassification from ktdent +author: John Snow Labs +name: nli_roberta_base_finetuned_for_amazon_review_ratings +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`nli_roberta_base_finetuned_for_amazon_review_ratings` is a English model originally trained by ktdent. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nli_roberta_base_finetuned_for_amazon_review_ratings_en_5.5.1_3.0_1734287635115.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nli_roberta_base_finetuned_for_amazon_review_ratings_en_5.5.1_3.0_1734287635115.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("nli_roberta_base_finetuned_for_amazon_review_ratings","en") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("nli_roberta_base_finetuned_for_amazon_review_ratings", "en")
+    .setInputCols(Array("document", "token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
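+
+A short sketch for reading the predicted label (and the per-class scores kept in the annotation metadata) from the example above; column names follow the `setOutputCol("class")` setting:
+
+```python
+from pyspark.sql.functions import explode
+
+# result holds the winning label; metadata keeps the per-class confidences.
+pipelineDF.withColumn("prediction", explode("class")) \
+    .selectExpr("prediction.result AS label", "prediction.metadata AS scores") \
+    .show(truncate=False)
+```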
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nli_roberta_base_finetuned_for_amazon_review_ratings| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|465.9 MB| + +## References + +https://huggingface.co/ktdent/nli-roberta-base-finetuned-for-amazon-review-ratings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-nli_roberta_base_finetuned_for_amazon_review_ratings_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-nli_roberta_base_finetuned_for_amazon_review_ratings_pipeline_en.md new file mode 100644 index 00000000000000..3bfb8a51592db7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-nli_roberta_base_finetuned_for_amazon_review_ratings_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English nli_roberta_base_finetuned_for_amazon_review_ratings_pipeline pipeline RoBertaForSequenceClassification from ktdent +author: John Snow Labs +name: nli_roberta_base_finetuned_for_amazon_review_ratings_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`nli_roberta_base_finetuned_for_amazon_review_ratings_pipeline` is a English model originally trained by ktdent. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nli_roberta_base_finetuned_for_amazon_review_ratings_pipeline_en_5.5.1_3.0_1734287666041.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nli_roberta_base_finetuned_for_amazon_review_ratings_pipeline_en_5.5.1_3.0_1734287666041.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("nli_roberta_base_finetuned_for_amazon_review_ratings_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("nli_roberta_base_finetuned_for_amazon_review_ratings_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nli_roberta_base_finetuned_for_amazon_review_ratings_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|466.0 MB| + +## References + +https://huggingface.co/ktdent/nli-roberta-base-finetuned-for-amazon-review-ratings + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-non_green_assamese_train_context_roberta_large_test_en.md b/docs/_posts/ahmedlone127/2024-12-15-non_green_assamese_train_context_roberta_large_test_en.md new file mode 100644 index 00000000000000..a5ae0c86cb31b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-non_green_assamese_train_context_roberta_large_test_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English non_green_assamese_train_context_roberta_large_test RoBertaForSequenceClassification from kghanlon +author: John Snow Labs +name: non_green_assamese_train_context_roberta_large_test +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`non_green_assamese_train_context_roberta_large_test` is a English model originally trained by kghanlon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/non_green_assamese_train_context_roberta_large_test_en_5.5.1_3.0_1734287178263.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/non_green_assamese_train_context_roberta_large_test_en_5.5.1_3.0_1734287178263.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("non_green_assamese_train_context_roberta_large_test","en") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("non_green_assamese_train_context_roberta_large_test", "en")
+    .setInputCols(Array("document", "token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|non_green_assamese_train_context_roberta_large_test| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/kghanlon/non_green_as_train_context_roberta-large_TEST \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-non_green_assamese_train_context_roberta_large_test_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-non_green_assamese_train_context_roberta_large_test_pipeline_en.md new file mode 100644 index 00000000000000..7de82169807609 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-non_green_assamese_train_context_roberta_large_test_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English non_green_assamese_train_context_roberta_large_test_pipeline pipeline RoBertaForSequenceClassification from kghanlon +author: John Snow Labs +name: non_green_assamese_train_context_roberta_large_test_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`non_green_assamese_train_context_roberta_large_test_pipeline` is a English model originally trained by kghanlon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/non_green_assamese_train_context_roberta_large_test_pipeline_en_5.5.1_3.0_1734287254749.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/non_green_assamese_train_context_roberta_large_test_pipeline_en_5.5.1_3.0_1734287254749.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("non_green_assamese_train_context_roberta_large_test_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("non_green_assamese_train_context_roberta_large_test_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|non_green_assamese_train_context_roberta_large_test_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/kghanlon/non_green_as_train_context_roberta-large_TEST + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-paraphrase_mpnet_base_v2_mbti_en.md b/docs/_posts/ahmedlone127/2024-12-15-paraphrase_mpnet_base_v2_mbti_en.md new file mode 100644 index 00000000000000..03db7617613b4b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-paraphrase_mpnet_base_v2_mbti_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English paraphrase_mpnet_base_v2_mbti MPNetForSequenceClassification from ClaudiaRichard +author: John Snow Labs +name: paraphrase_mpnet_base_v2_mbti +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, mpnet] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`paraphrase_mpnet_base_v2_mbti` is a English model originally trained by ClaudiaRichard. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/paraphrase_mpnet_base_v2_mbti_en_5.5.1_3.0_1734295031430.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/paraphrase_mpnet_base_v2_mbti_en_5.5.1_3.0_1734295031430.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = MPNetForSequenceClassification.pretrained("paraphrase_mpnet_base_v2_mbti","en") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = MPNetForSequenceClassification.pretrained("paraphrase_mpnet_base_v2_mbti", "en")
+    .setInputCols(Array("document", "token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|paraphrase_mpnet_base_v2_mbti| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.2 MB| + +## References + +https://huggingface.co/ClaudiaRichard/paraphrase-mpnet-base-v2_mbti \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-paraphrase_mpnet_base_v2_mbti_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-paraphrase_mpnet_base_v2_mbti_pipeline_en.md new file mode 100644 index 00000000000000..0763a0652ac391 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-paraphrase_mpnet_base_v2_mbti_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English paraphrase_mpnet_base_v2_mbti_pipeline pipeline MPNetForSequenceClassification from ClaudiaRichard +author: John Snow Labs +name: paraphrase_mpnet_base_v2_mbti_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`paraphrase_mpnet_base_v2_mbti_pipeline` is a English model originally trained by ClaudiaRichard. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/paraphrase_mpnet_base_v2_mbti_pipeline_en_5.5.1_3.0_1734295052150.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/paraphrase_mpnet_base_v2_mbti_pipeline_en_5.5.1_3.0_1734295052150.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("paraphrase_mpnet_base_v2_mbti_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("paraphrase_mpnet_base_v2_mbti_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|paraphrase_mpnet_base_v2_mbti_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.2 MB| + +## References + +https://huggingface.co/ClaudiaRichard/paraphrase-mpnet-base-v2_mbti + +## Included Models + +- DocumentAssembler +- TokenizerModel +- MPNetForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-pretrain_rugec_msu_en.md b/docs/_posts/ahmedlone127/2024-12-15-pretrain_rugec_msu_en.md new file mode 100644 index 00000000000000..1639dc9a06dd8c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-pretrain_rugec_msu_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English pretrain_rugec_msu T5Transformer from mika5883 +author: John Snow Labs +name: pretrain_rugec_msu +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`pretrain_rugec_msu` is a English model originally trained by mika5883. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pretrain_rugec_msu_en_5.5.1_3.0_1734301250753.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pretrain_rugec_msu_en_5.5.1_3.0_1734301250753.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("pretrain_rugec_msu","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("pretrain_rugec_msu", "en")
+    .setInputCols(Array("document"))
+    .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
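+
+Generation behaviour can usually be tuned on the annotator itself. The values below are illustrative only (a sketch, not settings recommended for this particular checkpoint):
+
+```python
+# Optional decoding controls on T5Transformer.
+t5 = T5Transformer.pretrained("pretrain_rugec_msu", "en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output") \
+    .setMaxOutputLength(128) \
+    .setDoSample(False)
+```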
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pretrain_rugec_msu| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/mika5883/pretrain_rugec_msu \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-pretrain_rugec_msu_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-pretrain_rugec_msu_pipeline_en.md new file mode 100644 index 00000000000000..f1a384cfbaedf5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-pretrain_rugec_msu_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English pretrain_rugec_msu_pipeline pipeline T5Transformer from mika5883 +author: John Snow Labs +name: pretrain_rugec_msu_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`pretrain_rugec_msu_pipeline` is a English model originally trained by mika5883. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pretrain_rugec_msu_pipeline_en_5.5.1_3.0_1734301317234.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pretrain_rugec_msu_pipeline_en_5.5.1_3.0_1734301317234.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("pretrain_rugec_msu_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("pretrain_rugec_msu_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pretrain_rugec_msu_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/mika5883/pretrain_rugec_msu + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-procedure_tool_matching_3_epochs_en.md b/docs/_posts/ahmedlone127/2024-12-15-procedure_tool_matching_3_epochs_en.md new file mode 100644 index 00000000000000..5ada16423fce93 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-procedure_tool_matching_3_epochs_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English procedure_tool_matching_3_epochs MPNetEmbeddings from brilan +author: John Snow Labs +name: procedure_tool_matching_3_epochs +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`procedure_tool_matching_3_epochs` is a English model originally trained by brilan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/procedure_tool_matching_3_epochs_en_5.5.1_3.0_1734306469752.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/procedure_tool_matching_3_epochs_en_5.5.1_3.0_1734306469752.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("procedure_tool_matching_3_epochs","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("procedure_tool_matching_3_epochs","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
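+
+Since this checkpoint is trained for matching, a common follow-up is to score similarity between two texts. A sketch under the assumption that numpy is available, the fitted `pipelineModel` from the example above is reused, and the two input strings are placeholders:
+
+```python
+import numpy as np
+
+pairs = spark.createDataFrame([["replace the pump seal"], ["torque wrench"]]).toDF("text")
+rows = pipelineModel.transform(pairs).select("embeddings.embeddings").collect()
+
+v1 = np.array(rows[0][0][0])  # first document's sentence vector
+v2 = np.array(rows[1][0][0])  # second document's sentence vector
+cosine = float(v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2)))
+print(cosine)
+```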
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|procedure_tool_matching_3_epochs| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.7 MB| + +## References + +https://huggingface.co/brilan/procedure-tool-matching_3_epochs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-procedure_tool_matching_3_epochs_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-procedure_tool_matching_3_epochs_pipeline_en.md new file mode 100644 index 00000000000000..a5a52821a022d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-procedure_tool_matching_3_epochs_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English procedure_tool_matching_3_epochs_pipeline pipeline MPNetEmbeddings from brilan +author: John Snow Labs +name: procedure_tool_matching_3_epochs_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`procedure_tool_matching_3_epochs_pipeline` is a English model originally trained by brilan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/procedure_tool_matching_3_epochs_pipeline_en_5.5.1_3.0_1734306490666.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/procedure_tool_matching_3_epochs_pipeline_en_5.5.1_3.0_1734306490666.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("procedure_tool_matching_3_epochs_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("procedure_tool_matching_3_epochs_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|procedure_tool_matching_3_epochs_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.7 MB| + +## References + +https://huggingface.co/brilan/procedure-tool-matching_3_epochs + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-product_model_16_10_24_en.md b/docs/_posts/ahmedlone127/2024-12-15-product_model_16_10_24_en.md new file mode 100644 index 00000000000000..8c284df33dfa90 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-product_model_16_10_24_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English product_model_16_10_24 MPNetEmbeddings from alpcansoydas +author: John Snow Labs +name: product_model_16_10_24 +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`product_model_16_10_24` is a English model originally trained by alpcansoydas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/product_model_16_10_24_en_5.5.1_3.0_1734305934802.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/product_model_16_10_24_en_5.5.1_3.0_1734305934802.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("product_model_16_10_24","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("product_model_16_10_24","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
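+
+For larger corpora, throughput is typically adjusted via the annotator's batch size; the value below is purely illustrative:
+
+```python
+# Batch size controls how many documents are embedded per ONNX inference call.
+embeddings = MPNetEmbeddings.pretrained("product_model_16_10_24", "en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("embeddings") \
+    .setBatchSize(16)
+```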
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|product_model_16_10_24| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.7 MB| + +## References + +https://huggingface.co/alpcansoydas/product-model-16.10.24 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-product_model_16_10_24_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-product_model_16_10_24_pipeline_en.md new file mode 100644 index 00000000000000..8f7b93f2eae33d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-product_model_16_10_24_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English product_model_16_10_24_pipeline pipeline MPNetEmbeddings from alpcansoydas +author: John Snow Labs +name: product_model_16_10_24_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`product_model_16_10_24_pipeline` is a English model originally trained by alpcansoydas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/product_model_16_10_24_pipeline_en_5.5.1_3.0_1734305955434.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/product_model_16_10_24_pipeline_en_5.5.1_3.0_1734305955434.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("product_model_16_10_24_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("product_model_16_10_24_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|product_model_16_10_24_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.7 MB| + +## References + +https://huggingface.co/alpcansoydas/product-model-16.10.24 + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-product_model_17_10_24_ifhavemorethan100sampleperfamily_en.md b/docs/_posts/ahmedlone127/2024-12-15-product_model_17_10_24_ifhavemorethan100sampleperfamily_en.md new file mode 100644 index 00000000000000..0222ba53438124 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-product_model_17_10_24_ifhavemorethan100sampleperfamily_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English product_model_17_10_24_ifhavemorethan100sampleperfamily MPNetEmbeddings from alpcansoydas +author: John Snow Labs +name: product_model_17_10_24_ifhavemorethan100sampleperfamily +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`product_model_17_10_24_ifhavemorethan100sampleperfamily` is a English model originally trained by alpcansoydas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/product_model_17_10_24_ifhavemorethan100sampleperfamily_en_5.5.1_3.0_1734305699352.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/product_model_17_10_24_ifhavemorethan100sampleperfamily_en_5.5.1_3.0_1734305699352.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("product_model_17_10_24_ifhavemorethan100sampleperfamily","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("product_model_17_10_24_ifhavemorethan100sampleperfamily","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|product_model_17_10_24_ifhavemorethan100sampleperfamily| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.7 MB| + +## References + +https://huggingface.co/alpcansoydas/product-model-17.10.24-ifhavemorethan100sampleperfamily \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-product_model_17_10_24_ifhavemorethan100sampleperfamily_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-product_model_17_10_24_ifhavemorethan100sampleperfamily_pipeline_en.md new file mode 100644 index 00000000000000..3d264fc5323ffa --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-product_model_17_10_24_ifhavemorethan100sampleperfamily_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English product_model_17_10_24_ifhavemorethan100sampleperfamily_pipeline pipeline MPNetEmbeddings from alpcansoydas +author: John Snow Labs +name: product_model_17_10_24_ifhavemorethan100sampleperfamily_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`product_model_17_10_24_ifhavemorethan100sampleperfamily_pipeline` is a English model originally trained by alpcansoydas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/product_model_17_10_24_ifhavemorethan100sampleperfamily_pipeline_en_5.5.1_3.0_1734305728582.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/product_model_17_10_24_ifhavemorethan100sampleperfamily_pipeline_en_5.5.1_3.0_1734305728582.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("product_model_17_10_24_ifhavemorethan100sampleperfamily_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("product_model_17_10_24_ifhavemorethan100sampleperfamily_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|product_model_17_10_24_ifhavemorethan100sampleperfamily_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.7 MB| + +## References + +https://huggingface.co/alpcansoydas/product-model-17.10.24-ifhavemorethan100sampleperfamily + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-prompt_enhancement_model_en.md b/docs/_posts/ahmedlone127/2024-12-15-prompt_enhancement_model_en.md new file mode 100644 index 00000000000000..910653cdcca82e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-prompt_enhancement_model_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English prompt_enhancement_model T5Transformer from K00B404 +author: John Snow Labs +name: prompt_enhancement_model +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`prompt_enhancement_model` is a English model originally trained by K00B404. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/prompt_enhancement_model_en_5.5.1_3.0_1734299117073.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/prompt_enhancement_model_en_5.5.1_3.0_1734299117073.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("prompt_enhancement_model","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("prompt_enhancement_model", "en")
+    .setInputCols(Array("document"))
+    .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|prompt_enhancement_model| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|960.2 MB| + +## References + +https://huggingface.co/K00B404/prompt-enhancement-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-prompt_enhancement_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-prompt_enhancement_model_pipeline_en.md new file mode 100644 index 00000000000000..8a1774fefac213 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-prompt_enhancement_model_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English prompt_enhancement_model_pipeline pipeline T5Transformer from K00B404 +author: John Snow Labs +name: prompt_enhancement_model_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`prompt_enhancement_model_pipeline` is a English model originally trained by K00B404. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/prompt_enhancement_model_pipeline_en_5.5.1_3.0_1734299177560.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/prompt_enhancement_model_pipeline_en_5.5.1_3.0_1734299177560.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("prompt_enhancement_model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("prompt_enhancement_model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|prompt_enhancement_model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|960.2 MB| + +## References + +https://huggingface.co/K00B404/prompt-enhancement-model + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-pubmed_bert_mlm_squad_covidqa_en.md b/docs/_posts/ahmedlone127/2024-12-15-pubmed_bert_mlm_squad_covidqa_en.md new file mode 100644 index 00000000000000..1561f1a81a44c9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-pubmed_bert_mlm_squad_covidqa_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English pubmed_bert_mlm_squad_covidqa BertForQuestionAnswering from Sarmila +author: John Snow Labs +name: pubmed_bert_mlm_squad_covidqa +date: 2024-12-15 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`pubmed_bert_mlm_squad_covidqa` is a English model originally trained by Sarmila. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pubmed_bert_mlm_squad_covidqa_en_5.5.1_3.0_1734297552999.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pubmed_bert_mlm_squad_covidqa_en_5.5.1_3.0_1734297552999.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("pubmed_bert_mlm_squad_covidqa","en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
+data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("pubmed_bert_mlm_squad_covidqa", "en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDS.toDF("question", "context")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
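+
+For interactive checks it can be simpler to wrap the fitted model in a `LightPipeline` instead of building a DataFrame; a sketch assuming the `pipelineModel` from the example above and that `fullAnnotate` accepts a question/context pair for multi-document pipelines:
+
+```python
+from sparknlp.base import LightPipeline
+
+light = LightPipeline(pipelineModel)
+# Question first, then the context it should be answered from.
+result = light.fullAnnotate("What framework do I use?", "I use spark-nlp.")
+print(result[0]["answer"])
+```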
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pubmed_bert_mlm_squad_covidqa| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/Sarmila/pubmed-bert-mlm-squad-covidqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-pubmed_bert_mlm_squad_covidqa_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-pubmed_bert_mlm_squad_covidqa_pipeline_en.md new file mode 100644 index 00000000000000..6c61768884794f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-pubmed_bert_mlm_squad_covidqa_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English pubmed_bert_mlm_squad_covidqa_pipeline pipeline BertForQuestionAnswering from Sarmila +author: John Snow Labs +name: pubmed_bert_mlm_squad_covidqa_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`pubmed_bert_mlm_squad_covidqa_pipeline` is a English model originally trained by Sarmila. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pubmed_bert_mlm_squad_covidqa_pipeline_en_5.5.1_3.0_1734297575356.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pubmed_bert_mlm_squad_covidqa_pipeline_en_5.5.1_3.0_1734297575356.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

# df provides the question/context pair; the column names assume the bundled
# MultiDocumentAssembler reads "question" and "context".
df = spark.createDataFrame([["What framework do I use?", "I use spark-nlp."]]).toDF("question", "context")
pipeline = PretrainedPipeline("pubmed_bert_mlm_squad_covidqa_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

// Input columns assumed to be "question" and "context".
val df = Seq(("What framework do I use?", "I use spark-nlp.")).toDF("question", "context")
val pipeline = new PretrainedPipeline("pubmed_bert_mlm_squad_covidqa_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pubmed_bert_mlm_squad_covidqa_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/Sarmila/pubmed-bert-mlm-squad-covidqa + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-question_decomposer_t5_en.md b/docs/_posts/ahmedlone127/2024-12-15-question_decomposer_t5_en.md new file mode 100644 index 00000000000000..e6a92337d8d94a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-question_decomposer_t5_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English question_decomposer_t5 T5Transformer from thenHung +author: John Snow Labs +name: question_decomposer_t5 +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`question_decomposer_t5` is a English model originally trained by thenHung. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/question_decomposer_t5_en_5.5.1_3.0_1734301461656.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/question_decomposer_t5_en_5.5.1_3.0_1734301461656.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("question_decomposer_t5","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val t5 = T5Transformer.pretrained("question_decomposer_t5", "en")
  .setInputCols(Array("document"))
  .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
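For one-off questions, the fitted pipeline can be wrapped in a `LightPipeline`, which runs on plain Python strings instead of DataFrames. A sketch assuming `pipelineModel` was fitted as in the Python example above (the question is illustrative only):

```python
from sparknlp.base import LightPipeline

# LightPipeline executes the fitted stages on raw strings, handy for quick tests.
light = LightPipeline(pipelineModel)
print(light.annotate("Who directed the film that won Best Picture in 1998?"))
```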
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|question_decomposer_t5| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|347.3 MB| + +## References + +https://huggingface.co/thenHung/question_decomposer_t5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-question_decomposer_t5_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-question_decomposer_t5_pipeline_en.md new file mode 100644 index 00000000000000..45882ca786fecd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-question_decomposer_t5_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English question_decomposer_t5_pipeline pipeline T5Transformer from thenHung +author: John Snow Labs +name: question_decomposer_t5_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`question_decomposer_t5_pipeline` is a English model originally trained by thenHung. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/question_decomposer_t5_pipeline_en_5.5.1_3.0_1734301480215.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/question_decomposer_t5_pipeline_en_5.5.1_3.0_1734301480215.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipeline = PretrainedPipeline("question_decomposer_t5_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")
val pipeline = new PretrainedPipeline("question_decomposer_t5_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|question_decomposer_t5_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|347.3 MB| + +## References + +https://huggingface.co/thenHung/question_decomposer_t5 + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-ratemyprofessor_distilbert_en.md b/docs/_posts/ahmedlone127/2024-12-15-ratemyprofessor_distilbert_en.md new file mode 100644 index 00000000000000..04c5c164a81953 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-ratemyprofessor_distilbert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English ratemyprofessor_distilbert DistilBertEmbeddings from ricebucket +author: John Snow Labs +name: ratemyprofessor_distilbert +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ratemyprofessor_distilbert` is a English model originally trained by ricebucket. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ratemyprofessor_distilbert_en_5.5.1_3.0_1734289503015.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ratemyprofessor_distilbert_en_5.5.1_3.0_1734289503015.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("ratemyprofessor_distilbert","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("ratemyprofessor_distilbert","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
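The token-level vectors produced above live in the `embeddings` annotation column; a brief sketch (continuing the Python example) that flattens them into one row per token:

```python
from pyspark.sql import functions as F

# Explode the annotation array so each row holds one token and its embedding vector.
exploded = pipelineDF.select(F.explode("embeddings").alias("ann"))
exploded.select(F.col("ann.result").alias("token"),
                F.col("ann.embeddings").alias("vector")).show(truncate=False)
```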
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ratemyprofessor_distilbert| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ricebucket/RateMyProfessor_DistilBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-ratemyprofessor_distilbert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-ratemyprofessor_distilbert_pipeline_en.md new file mode 100644 index 00000000000000..520bb7080970ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-ratemyprofessor_distilbert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English ratemyprofessor_distilbert_pipeline pipeline DistilBertEmbeddings from ricebucket +author: John Snow Labs +name: ratemyprofessor_distilbert_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ratemyprofessor_distilbert_pipeline` is a English model originally trained by ricebucket. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ratemyprofessor_distilbert_pipeline_en_5.5.1_3.0_1734289516437.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ratemyprofessor_distilbert_pipeline_en_5.5.1_3.0_1734289516437.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipeline = PretrainedPipeline("ratemyprofessor_distilbert_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")
val pipeline = new PretrainedPipeline("ratemyprofessor_distilbert_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ratemyprofessor_distilbert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ricebucket/RateMyProfessor_DistilBERT + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-refpydst_100p_referredstates_en.md b/docs/_posts/ahmedlone127/2024-12-15-refpydst_100p_referredstates_en.md new file mode 100644 index 00000000000000..eddc11d098a94f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-refpydst_100p_referredstates_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English refpydst_100p_referredstates MPNetEmbeddings from Brendan +author: John Snow Labs +name: refpydst_100p_referredstates +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`refpydst_100p_referredstates` is a English model originally trained by Brendan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/refpydst_100p_referredstates_en_5.5.1_3.0_1734306206391.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/refpydst_100p_referredstates_en_5.5.1_3.0_1734306206391.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("refpydst_100p_referredstates","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("refpydst_100p_referredstates","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
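If downstream Spark ML stages expect plain vectors rather than annotation structs, an `EmbeddingsFinisher` can be appended to the pipeline. A sketch that reuses `documentAssembler`, `embeddings`, and `data` from the Python example above:

```python
from sparknlp.base import EmbeddingsFinisher

# Convert the MPNet annotation column into Spark ML vectors.
finisher = EmbeddingsFinisher() \
    .setInputCols(["embeddings"]) \
    .setOutputCols(["finished_embeddings"]) \
    .setOutputAsVector(True)

pipeline = Pipeline().setStages([documentAssembler, embeddings, finisher])
pipeline.fit(data).transform(data).select("finished_embeddings").show(truncate=False)
```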
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|refpydst_100p_referredstates| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.8 MB| + +## References + +https://huggingface.co/Brendan/refpydst-100p-referredstates \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-refpydst_100p_referredstates_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-refpydst_100p_referredstates_pipeline_en.md new file mode 100644 index 00000000000000..e0fd1233f59dc8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-refpydst_100p_referredstates_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English refpydst_100p_referredstates_pipeline pipeline MPNetEmbeddings from Brendan +author: John Snow Labs +name: refpydst_100p_referredstates_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`refpydst_100p_referredstates_pipeline` is a English model originally trained by Brendan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/refpydst_100p_referredstates_pipeline_en_5.5.1_3.0_1734306227645.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/refpydst_100p_referredstates_pipeline_en_5.5.1_3.0_1734306227645.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipeline = PretrainedPipeline("refpydst_100p_referredstates_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")
val pipeline = new PretrainedPipeline("refpydst_100p_referredstates_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|refpydst_100p_referredstates_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.8 MB| + +## References + +https://huggingface.co/Brendan/refpydst-100p-referredstates + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-rephrase_ai_en.md b/docs/_posts/ahmedlone127/2024-12-15-rephrase_ai_en.md new file mode 100644 index 00000000000000..ebce3f55b07564 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-rephrase_ai_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English rephrase_ai T5Transformer from sahilselokar +author: John Snow Labs +name: rephrase_ai +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`rephrase_ai` is a English model originally trained by sahilselokar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rephrase_ai_en_5.5.1_3.0_1734302472419.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rephrase_ai_en_5.5.1_3.0_1734302472419.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("rephrase_ai","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val t5 = T5Transformer.pretrained("rephrase_ai", "en")
  .setInputCols(Array("document"))
  .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rephrase_ai| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/sahilselokar/RePhrase-Ai \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-rephrase_ai_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-rephrase_ai_pipeline_en.md new file mode 100644 index 00000000000000..35b57804384ed8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-rephrase_ai_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English rephrase_ai_pipeline pipeline T5Transformer from sahilselokar +author: John Snow Labs +name: rephrase_ai_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`rephrase_ai_pipeline` is a English model originally trained by sahilselokar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rephrase_ai_pipeline_en_5.5.1_3.0_1734302524151.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rephrase_ai_pipeline_en_5.5.1_3.0_1734302524151.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipeline = PretrainedPipeline("rephrase_ai_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")
val pipeline = new PretrainedPipeline("rephrase_ai_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rephrase_ai_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/sahilselokar/RePhrase-Ai + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-results_glebnonegolubin_en.md b/docs/_posts/ahmedlone127/2024-12-15-results_glebnonegolubin_en.md new file mode 100644 index 00000000000000..c8803b35a34e56 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-results_glebnonegolubin_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English results_glebnonegolubin T5Transformer from GlebNoNeGolubin +author: John Snow Labs +name: results_glebnonegolubin +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`results_glebnonegolubin` is a English model originally trained by GlebNoNeGolubin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/results_glebnonegolubin_en_5.5.1_3.0_1734301450263.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/results_glebnonegolubin_en_5.5.1_3.0_1734301450263.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("results_glebnonegolubin","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val t5 = T5Transformer.pretrained("results_glebnonegolubin", "en")
  .setInputCols(Array("document"))
  .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|results_glebnonegolubin| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|349.8 MB| + +## References + +https://huggingface.co/GlebNoNeGolubin/results \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-results_priyankrathore_en.md b/docs/_posts/ahmedlone127/2024-12-15-results_priyankrathore_en.md new file mode 100644 index 00000000000000..d44773ee918371 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-results_priyankrathore_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English results_priyankrathore T5Transformer from priyankrathore +author: John Snow Labs +name: results_priyankrathore +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`results_priyankrathore` is a English model originally trained by priyankrathore. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/results_priyankrathore_en_5.5.1_3.0_1734299572367.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/results_priyankrathore_en_5.5.1_3.0_1734299572367.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("results_priyankrathore","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val t5 = T5Transformer.pretrained("results_priyankrathore", "en")
  .setInputCols(Array("document"))
  .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|results_priyankrathore| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|341.1 MB| + +## References + +https://huggingface.co/priyankrathore/results \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-results_priyankrathore_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-results_priyankrathore_pipeline_en.md new file mode 100644 index 00000000000000..b757a095dde840 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-results_priyankrathore_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English results_priyankrathore_pipeline pipeline T5Transformer from priyankrathore +author: John Snow Labs +name: results_priyankrathore_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`results_priyankrathore_pipeline` is a English model originally trained by priyankrathore. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/results_priyankrathore_pipeline_en_5.5.1_3.0_1734299593371.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/results_priyankrathore_pipeline_en_5.5.1_3.0_1734299593371.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipeline = PretrainedPipeline("results_priyankrathore_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")
val pipeline = new PretrainedPipeline("results_priyankrathore_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|results_priyankrathore_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|341.1 MB| + +## References + +https://huggingface.co/priyankrathore/results + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-roberta_base_ag_news_202310232117_en.md b/docs/_posts/ahmedlone127/2024-12-15-roberta_base_ag_news_202310232117_en.md new file mode 100644 index 00000000000000..e4f5d386fd06e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-roberta_base_ag_news_202310232117_en.md @@ -0,0 +1,98 @@ +--- +layout: model +title: English roberta_base_ag_news_202310232117 RoBertaForSequenceClassification from DaymonQu +author: John Snow Labs +name: roberta_base_ag_news_202310232117 +date: 2024-12-15 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_base_ag_news_202310232117` is a English model originally trained by DaymonQu. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_ag_news_202310232117_en_5.5.1_3.0_1734287549244.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_ag_news_202310232117_en_5.5.1_3.0_1734287549244.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
document_assembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

tokenizer = Tokenizer()\
    .setInputCols("document")\
    .setOutputCol("token")

sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_ag_news_202310232117","en")\
    .setInputCols(["document","token"])\
    .setOutputCol("class")

pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])

data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")

result = pipeline.fit(data).transform(data)
```
```scala
val document_assembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_ag_news_202310232117","en")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))

val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")

val result = pipeline.fit(data).transform(data)
```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_ag_news_202310232117| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|464.9 MB| + +## References + +References + +https://huggingface.co/DaymonQu/roberta-base_ag_news_202310232117 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-roberta_base_ag_news_202310232117_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-roberta_base_ag_news_202310232117_pipeline_en.md new file mode 100644 index 00000000000000..336017bb239996 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-roberta_base_ag_news_202310232117_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English roberta_base_ag_news_202310232117_pipeline pipeline RoBertaForSequenceClassification from AnonymousAuthorConfSubmission +author: John Snow Labs +name: roberta_base_ag_news_202310232117_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_base_ag_news_202310232117_pipeline` is a English model originally trained by AnonymousAuthorConfSubmission. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_ag_news_202310232117_pipeline_en_5.5.1_3.0_1734287577523.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_ag_news_202310232117_pipeline_en_5.5.1_3.0_1734287577523.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipeline = PretrainedPipeline("roberta_base_ag_news_202310232117_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")
val pipeline = new PretrainedPipeline("roberta_base_ag_news_202310232117_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_ag_news_202310232117_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|465.0 MB| + +## References + +https://huggingface.co/AnonymousAuthorConfSubmission/roberta-base_ag_news_202310232117 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-roberta_base_ai_detection_en.md b/docs/_posts/ahmedlone127/2024-12-15-roberta_base_ai_detection_en.md new file mode 100644 index 00000000000000..72731a7344db1d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-roberta_base_ai_detection_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English roberta_base_ai_detection RoBertaForSequenceClassification from Varun53 +author: John Snow Labs +name: roberta_base_ai_detection +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_base_ai_detection` is a English model originally trained by Varun53. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_ai_detection_en_5.5.1_3.0_1734286829143.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_ai_detection_en_5.5.1_3.0_1734286829143.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_ai_detection","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_ai_detection", "en")
  .setInputCols(Array("document","token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
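After the transform, the predicted label for each row is available in the `class` output column; a brief follow-up to the Python example above:

```python
# The classifier emits one annotation per document; `result` holds the predicted label.
pipelineDF.select("text", "class.result").show(truncate=False)
```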
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_ai_detection| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|430.8 MB| + +## References + +https://huggingface.co/Varun53/roberta-base-AI-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-roberta_base_ai_detection_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-roberta_base_ai_detection_pipeline_en.md new file mode 100644 index 00000000000000..ad75a2bbe3ddda --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-roberta_base_ai_detection_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English roberta_base_ai_detection_pipeline pipeline RoBertaForSequenceClassification from Varun53 +author: John Snow Labs +name: roberta_base_ai_detection_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_base_ai_detection_pipeline` is a English model originally trained by Varun53. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_ai_detection_pipeline_en_5.5.1_3.0_1734286869938.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_ai_detection_pipeline_en_5.5.1_3.0_1734286869938.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipeline = PretrainedPipeline("roberta_base_ai_detection_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")
val pipeline = new PretrainedPipeline("roberta_base_ai_detection_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_ai_detection_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|430.8 MB| + +## References + +https://huggingface.co/Varun53/roberta-base-AI-detection + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-roberta_base_finetuned_stsb_en.md b/docs/_posts/ahmedlone127/2024-12-15-roberta_base_finetuned_stsb_en.md new file mode 100644 index 00000000000000..cda3b583cbe590 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-roberta_base_finetuned_stsb_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English roberta_base_finetuned_stsb RoBertaForSequenceClassification from NikoK +author: John Snow Labs +name: roberta_base_finetuned_stsb +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_base_finetuned_stsb` is a English model originally trained by NikoK. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_stsb_en_5.5.1_3.0_1734286830879.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_stsb_en_5.5.1_3.0_1734286830879.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_stsb","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_stsb", "en")
  .setInputCols(Array("document","token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_finetuned_stsb| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|439.1 MB| + +## References + +https://huggingface.co/NikoK/roberta-base-finetuned-stsb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-roberta_base_finetuned_stsb_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-roberta_base_finetuned_stsb_pipeline_en.md new file mode 100644 index 00000000000000..98a336e78a4bc0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-roberta_base_finetuned_stsb_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English roberta_base_finetuned_stsb_pipeline pipeline RoBertaForSequenceClassification from NikoK +author: John Snow Labs +name: roberta_base_finetuned_stsb_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_base_finetuned_stsb_pipeline` is a English model originally trained by NikoK. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_stsb_pipeline_en_5.5.1_3.0_1734286873029.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_stsb_pipeline_en_5.5.1_3.0_1734286873029.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipeline = PretrainedPipeline("roberta_base_finetuned_stsb_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")
val pipeline = new PretrainedPipeline("roberta_base_finetuned_stsb_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_finetuned_stsb_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|439.1 MB| + +## References + +https://huggingface.co/NikoK/roberta-base-finetuned-stsb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-roberta_base_immifilms_en.md b/docs/_posts/ahmedlone127/2024-12-15-roberta_base_immifilms_en.md new file mode 100644 index 00000000000000..a2be14ba6d9904 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-roberta_base_immifilms_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English roberta_base_immifilms RoBertaForSequenceClassification from wenbrau +author: John Snow Labs +name: roberta_base_immifilms +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_base_immifilms` is a English model originally trained by wenbrau. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_immifilms_en_5.5.1_3.0_1734287049087.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_immifilms_en_5.5.1_3.0_1734287049087.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_immifilms","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_immifilms", "en")
  .setInputCols(Array("document","token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_immifilms| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|452.2 MB| + +## References + +https://huggingface.co/wenbrau/roberta-base_immifilms \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-roberta_base_immifilms_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-roberta_base_immifilms_pipeline_en.md new file mode 100644 index 00000000000000..253a50a50129c0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-roberta_base_immifilms_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English roberta_base_immifilms_pipeline pipeline RoBertaForSequenceClassification from wenbrau +author: John Snow Labs +name: roberta_base_immifilms_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_base_immifilms_pipeline` is a English model originally trained by wenbrau. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_immifilms_pipeline_en_5.5.1_3.0_1734287076794.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_immifilms_pipeline_en_5.5.1_3.0_1734287076794.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipeline = PretrainedPipeline("roberta_base_immifilms_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")
val pipeline = new PretrainedPipeline("roberta_base_immifilms_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_immifilms_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|452.2 MB| + +## References + +https://huggingface.co/wenbrau/roberta-base_immifilms + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-roberta_base_ours_rundi_1_en.md b/docs/_posts/ahmedlone127/2024-12-15-roberta_base_ours_rundi_1_en.md new file mode 100644 index 00000000000000..c9907b4bfd1ff2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-roberta_base_ours_rundi_1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English roberta_base_ours_rundi_1 RoBertaForSequenceClassification from SkyR +author: John Snow Labs +name: roberta_base_ours_rundi_1 +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_base_ours_rundi_1` is a English model originally trained by SkyR. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_ours_rundi_1_en_5.5.1_3.0_1734286987503.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_ours_rundi_1_en_5.5.1_3.0_1734286987503.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_ours_rundi_1","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_ours_rundi_1", "en")
  .setInputCols(Array("document","token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_ours_rundi_1| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|429.3 MB| + +## References + +https://huggingface.co/SkyR/roberta-base-ours-run-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-roberta_base_ours_rundi_1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-roberta_base_ours_rundi_1_pipeline_en.md new file mode 100644 index 00000000000000..bb272619799657 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-roberta_base_ours_rundi_1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English roberta_base_ours_rundi_1_pipeline pipeline RoBertaForSequenceClassification from SkyR +author: John Snow Labs +name: roberta_base_ours_rundi_1_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_base_ours_rundi_1_pipeline` is a English model originally trained by SkyR. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_ours_rundi_1_pipeline_en_5.5.1_3.0_1734287026953.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_ours_rundi_1_pipeline_en_5.5.1_3.0_1734287026953.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("roberta_base_ours_rundi_1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("roberta_base_ours_rundi_1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
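The `df` referenced above is any DataFrame with a `text` column. As an illustrative sketch (assuming `sparknlp` is installed and the pipeline name resolves as listed on this card), the pipeline can be applied either to a DataFrame or to a single string:

```python
import sparknlp
from sparknlp.pretrained import PretrainedPipeline

spark = sparknlp.start()

pipeline = PretrainedPipeline("roberta_base_ours_rundi_1_pipeline", lang="en")

# Option 1: transform a DataFrame that has a "text" column.
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)
annotations.select("class.result").show(truncate=False)  # output column name per this card

# Option 2: annotate a single string without building a DataFrame.
print(pipeline.annotate("I love spark-nlp"))
```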
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_ours_rundi_1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|429.3 MB| + +## References + +https://huggingface.co/SkyR/roberta-base-ours-run-1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-roberta_cls_url_en.md b/docs/_posts/ahmedlone127/2024-12-15-roberta_cls_url_en.md new file mode 100644 index 00000000000000..342cbd3ab08789 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-roberta_cls_url_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English roberta_cls_url XlmRoBertaForSequenceClassification from RonTon05 +author: John Snow Labs +name: roberta_cls_url +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_cls_url` is a English model originally trained by RonTon05. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_cls_url_en_5.5.1_3.0_1734291307031.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_cls_url_en_5.5.1_3.0_1734291307031.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("roberta_cls_url","en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("roberta_cls_url", "en")
    .setInputCols(Array("document", "token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_cls_url| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|890.3 MB| + +## References + +https://huggingface.co/RonTon05/Roberta-CLS-URL \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-roberta_cls_url_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-roberta_cls_url_pipeline_en.md new file mode 100644 index 00000000000000..b1f07a28010efa --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-roberta_cls_url_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English roberta_cls_url_pipeline pipeline XlmRoBertaForSequenceClassification from RonTon05 +author: John Snow Labs +name: roberta_cls_url_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_cls_url_pipeline` is a English model originally trained by RonTon05. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_cls_url_pipeline_en_5.5.1_3.0_1734291394234.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_cls_url_pipeline_en_5.5.1_3.0_1734291394234.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("roberta_cls_url_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("roberta_cls_url_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_cls_url_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|890.3 MB| + +## References + +https://huggingface.co/RonTon05/Roberta-CLS-URL + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-roberta_sentimental_analysis_v1_en.md b/docs/_posts/ahmedlone127/2024-12-15-roberta_sentimental_analysis_v1_en.md new file mode 100644 index 00000000000000..7176d0a0d344bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-roberta_sentimental_analysis_v1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English roberta_sentimental_analysis_v1 RoBertaForSequenceClassification from syedkhalid076 +author: John Snow Labs +name: roberta_sentimental_analysis_v1 +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_sentimental_analysis_v1` is a English model originally trained by syedkhalid076. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_sentimental_analysis_v1_en_5.5.1_3.0_1734286827634.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_sentimental_analysis_v1_en_5.5.1_3.0_1734286827634.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_sentimental_analysis_v1","en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_sentimental_analysis_v1", "en")
    .setInputCols(Array("document", "token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_sentimental_analysis_v1| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|443.6 MB| + +## References + +https://huggingface.co/syedkhalid076/RoBERTa-Sentimental-Analysis-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-roberta_sentimental_analysis_v1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-roberta_sentimental_analysis_v1_pipeline_en.md new file mode 100644 index 00000000000000..d821bcf3c49a29 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-roberta_sentimental_analysis_v1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English roberta_sentimental_analysis_v1_pipeline pipeline RoBertaForSequenceClassification from syedkhalid076 +author: John Snow Labs +name: roberta_sentimental_analysis_v1_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_sentimental_analysis_v1_pipeline` is a English model originally trained by syedkhalid076. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_sentimental_analysis_v1_pipeline_en_5.5.1_3.0_1734286868123.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_sentimental_analysis_v1_pipeline_en_5.5.1_3.0_1734286868123.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("roberta_sentimental_analysis_v1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("roberta_sentimental_analysis_v1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_sentimental_analysis_v1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|443.7 MB| + +## References + +https://huggingface.co/syedkhalid076/RoBERTa-Sentimental-Analysis-v1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-roberta_sst5_padding100model_en.md b/docs/_posts/ahmedlone127/2024-12-15-roberta_sst5_padding100model_en.md new file mode 100644 index 00000000000000..dfe96a33cc582e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-roberta_sst5_padding100model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English roberta_sst5_padding100model RoBertaForSequenceClassification from Realgon +author: John Snow Labs +name: roberta_sst5_padding100model +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_sst5_padding100model` is a English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_sst5_padding100model_en_5.5.1_3.0_1734287400613.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_sst5_padding100model_en_5.5.1_3.0_1734287400613.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_sst5_padding100model","en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_sst5_padding100model", "en")
    .setInputCols(Array("document", "token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_sst5_padding100model| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|439.4 MB| + +## References + +https://huggingface.co/Realgon/roberta_sst5_padding100model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-roberta_sst5_padding100model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-roberta_sst5_padding100model_pipeline_en.md new file mode 100644 index 00000000000000..fd953bd3e9bc49 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-roberta_sst5_padding100model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English roberta_sst5_padding100model_pipeline pipeline RoBertaForSequenceClassification from Realgon +author: John Snow Labs +name: roberta_sst5_padding100model_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_sst5_padding100model_pipeline` is a English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_sst5_padding100model_pipeline_en_5.5.1_3.0_1734287432636.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_sst5_padding100model_pipeline_en_5.5.1_3.0_1734287432636.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("roberta_sst5_padding100model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("roberta_sst5_padding100model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_sst5_padding100model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|439.4 MB| + +## References + +https://huggingface.co/Realgon/roberta_sst5_padding100model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-roberta_sst5_padding30model_en.md b/docs/_posts/ahmedlone127/2024-12-15-roberta_sst5_padding30model_en.md new file mode 100644 index 00000000000000..445d2a205ea4d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-roberta_sst5_padding30model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English roberta_sst5_padding30model RoBertaForSequenceClassification from Realgon +author: John Snow Labs +name: roberta_sst5_padding30model +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_sst5_padding30model` is a English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_sst5_padding30model_en_5.5.1_3.0_1734287033367.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_sst5_padding30model_en_5.5.1_3.0_1734287033367.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_sst5_padding30model","en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_sst5_padding30model", "en")
    .setInputCols(Array("document", "token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_sst5_padding30model| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|439.2 MB| + +## References + +https://huggingface.co/Realgon/roberta_sst5_padding30model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-roberta_sst5_padding30model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-roberta_sst5_padding30model_pipeline_en.md new file mode 100644 index 00000000000000..6209cca0a597cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-roberta_sst5_padding30model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English roberta_sst5_padding30model_pipeline pipeline RoBertaForSequenceClassification from Realgon +author: John Snow Labs +name: roberta_sst5_padding30model_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_sst5_padding30model_pipeline` is a English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_sst5_padding30model_pipeline_en_5.5.1_3.0_1734287066581.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_sst5_padding30model_pipeline_en_5.5.1_3.0_1734287066581.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("roberta_sst5_padding30model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("roberta_sst5_padding30model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_sst5_padding30model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|439.2 MB| + +## References + +https://huggingface.co/Realgon/roberta_sst5_padding30model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-roberta_toxic_classifier_en.md b/docs/_posts/ahmedlone127/2024-12-15-roberta_toxic_classifier_en.md new file mode 100644 index 00000000000000..1ce909ce6e60ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-roberta_toxic_classifier_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English roberta_toxic_classifier RoBertaForSequenceClassification from pt-sk +author: John Snow Labs +name: roberta_toxic_classifier +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_toxic_classifier` is a English model originally trained by pt-sk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_toxic_classifier_en_5.5.1_3.0_1734287700700.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_toxic_classifier_en_5.5.1_3.0_1734287700700.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_toxic_classifier","en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_toxic_classifier", "en")
    .setInputCols(Array("document", "token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_toxic_classifier| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.9 MB| + +## References + +https://huggingface.co/pt-sk/roberta_toxic_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-roberta_toxic_classifier_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-roberta_toxic_classifier_pipeline_en.md new file mode 100644 index 00000000000000..c6b5e52d4d964e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-roberta_toxic_classifier_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English roberta_toxic_classifier_pipeline pipeline RoBertaForSequenceClassification from pt-sk +author: John Snow Labs +name: roberta_toxic_classifier_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_toxic_classifier_pipeline` is a English model originally trained by pt-sk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_toxic_classifier_pipeline_en_5.5.1_3.0_1734287725751.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_toxic_classifier_pipeline_en_5.5.1_3.0_1734287725751.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("roberta_toxic_classifier_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("roberta_toxic_classifier_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_toxic_classifier_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|468.0 MB| + +## References + +https://huggingface.co/pt-sk/roberta_toxic_classifier + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-rubert_tiny2_finetuned_fintech_en.md b/docs/_posts/ahmedlone127/2024-12-15-rubert_tiny2_finetuned_fintech_en.md new file mode 100644 index 00000000000000..adc8373bab9d19 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-rubert_tiny2_finetuned_fintech_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English rubert_tiny2_finetuned_fintech BertEmbeddings from Pastushoc +author: John Snow Labs +name: rubert_tiny2_finetuned_fintech +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`rubert_tiny2_finetuned_fintech` is a English model originally trained by Pastushoc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_tiny2_finetuned_fintech_en_5.5.1_3.0_1734284150086.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_tiny2_finetuned_fintech_en_5.5.1_3.0_1734284150086.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("rubert_tiny2_finetuned_fintech","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = BertEmbeddings.pretrained("rubert_tiny2_finetuned_fintech","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
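Each token's vector is stored in the `embeddings` field of the output annotations. As a short, hypothetical follow-up to the Python example above (reusing its `pipelineDF`), the raw vectors can be pulled out of the annotation structs like this:

```python
from pyspark.sql import functions as F

# One row per token: the token text plus its embedding vector.
token_vectors = (pipelineDF
    .select(F.explode("embeddings").alias("emb"))
    .select(F.col("emb.result").alias("token"),
            F.col("emb.embeddings").alias("vector")))

token_vectors.show(truncate=80)
```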
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_tiny2_finetuned_fintech| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[bert]| +|Language:|en| +|Size:|108.8 MB| + +## References + +https://huggingface.co/Pastushoc/rubert-tiny2-finetuned-fintech \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-rubert_tiny2_finetuned_fintech_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-rubert_tiny2_finetuned_fintech_pipeline_en.md new file mode 100644 index 00000000000000..ed912eebcfc369 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-rubert_tiny2_finetuned_fintech_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English rubert_tiny2_finetuned_fintech_pipeline pipeline BertEmbeddings from Pastushoc +author: John Snow Labs +name: rubert_tiny2_finetuned_fintech_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`rubert_tiny2_finetuned_fintech_pipeline` is a English model originally trained by Pastushoc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_tiny2_finetuned_fintech_pipeline_en_5.5.1_3.0_1734284155347.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_tiny2_finetuned_fintech_pipeline_en_5.5.1_3.0_1734284155347.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("rubert_tiny2_finetuned_fintech_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("rubert_tiny2_finetuned_fintech_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_tiny2_finetuned_fintech_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|108.9 MB| + +## References + +https://huggingface.co/Pastushoc/rubert-tiny2-finetuned-fintech + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-scenario_tcr_4_data_english_cardiff_eng_only4_en.md b/docs/_posts/ahmedlone127/2024-12-15-scenario_tcr_4_data_english_cardiff_eng_only4_en.md new file mode 100644 index 00000000000000..86cf30343d4006 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-scenario_tcr_4_data_english_cardiff_eng_only4_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English scenario_tcr_4_data_english_cardiff_eng_only4 XlmRoBertaForSequenceClassification from haryoaw +author: John Snow Labs +name: scenario_tcr_4_data_english_cardiff_eng_only4 +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`scenario_tcr_4_data_english_cardiff_eng_only4` is a English model originally trained by haryoaw. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scenario_tcr_4_data_english_cardiff_eng_only4_en_5.5.1_3.0_1734292520220.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scenario_tcr_4_data_english_cardiff_eng_only4_en_5.5.1_3.0_1734292520220.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("scenario_tcr_4_data_english_cardiff_eng_only4","en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("scenario_tcr_4_data_english_cardiff_eng_only4", "en")
    .setInputCols(Array("document", "token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scenario_tcr_4_data_english_cardiff_eng_only4| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|821.2 MB| + +## References + +https://huggingface.co/haryoaw/scenario-TCR-4_data-en-cardiff_eng_only4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-scenario_tcr_4_data_english_cardiff_eng_only4_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-scenario_tcr_4_data_english_cardiff_eng_only4_pipeline_en.md new file mode 100644 index 00000000000000..1ccda122caeddd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-scenario_tcr_4_data_english_cardiff_eng_only4_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English scenario_tcr_4_data_english_cardiff_eng_only4_pipeline pipeline XlmRoBertaForSequenceClassification from haryoaw +author: John Snow Labs +name: scenario_tcr_4_data_english_cardiff_eng_only4_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`scenario_tcr_4_data_english_cardiff_eng_only4_pipeline` is a English model originally trained by haryoaw. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scenario_tcr_4_data_english_cardiff_eng_only4_pipeline_en_5.5.1_3.0_1734292625209.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scenario_tcr_4_data_english_cardiff_eng_only4_pipeline_en_5.5.1_3.0_1734292625209.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("scenario_tcr_4_data_english_cardiff_eng_only4_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("scenario_tcr_4_data_english_cardiff_eng_only4_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scenario_tcr_4_data_english_cardiff_eng_only4_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|821.2 MB| + +## References + +https://huggingface.co/haryoaw/scenario-TCR-4_data-en-cardiff_eng_only4 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-scenario_tcr_data_cl_cardiff_cl_only3_en.md b/docs/_posts/ahmedlone127/2024-12-15-scenario_tcr_data_cl_cardiff_cl_only3_en.md new file mode 100644 index 00000000000000..e3048ea0dbdcee --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-scenario_tcr_data_cl_cardiff_cl_only3_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English scenario_tcr_data_cl_cardiff_cl_only3 XlmRoBertaForSequenceClassification from haryoaw +author: John Snow Labs +name: scenario_tcr_data_cl_cardiff_cl_only3 +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`scenario_tcr_data_cl_cardiff_cl_only3` is a English model originally trained by haryoaw. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scenario_tcr_data_cl_cardiff_cl_only3_en_5.5.1_3.0_1734292681694.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scenario_tcr_data_cl_cardiff_cl_only3_en_5.5.1_3.0_1734292681694.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("scenario_tcr_data_cl_cardiff_cl_only3","en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("scenario_tcr_data_cl_cardiff_cl_only3", "en")
    .setInputCols(Array("document", "token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scenario_tcr_data_cl_cardiff_cl_only3| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|850.1 MB| + +## References + +https://huggingface.co/haryoaw/scenario-TCR_data-cl-cardiff_cl_only3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-scenario_tcr_data_cl_cardiff_cl_only3_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-scenario_tcr_data_cl_cardiff_cl_only3_pipeline_en.md new file mode 100644 index 00000000000000..0c4881034f38f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-scenario_tcr_data_cl_cardiff_cl_only3_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English scenario_tcr_data_cl_cardiff_cl_only3_pipeline pipeline XlmRoBertaForSequenceClassification from haryoaw +author: John Snow Labs +name: scenario_tcr_data_cl_cardiff_cl_only3_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`scenario_tcr_data_cl_cardiff_cl_only3_pipeline` is a English model originally trained by haryoaw. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scenario_tcr_data_cl_cardiff_cl_only3_pipeline_en_5.5.1_3.0_1734292771410.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scenario_tcr_data_cl_cardiff_cl_only3_pipeline_en_5.5.1_3.0_1734292771410.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("scenario_tcr_data_cl_cardiff_cl_only3_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("scenario_tcr_data_cl_cardiff_cl_only3_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scenario_tcr_data_cl_cardiff_cl_only3_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|850.1 MB| + +## References + +https://huggingface.co/haryoaw/scenario-TCR_data-cl-cardiff_cl_only3 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-semantic_embedding_2_en.md b/docs/_posts/ahmedlone127/2024-12-15-semantic_embedding_2_en.md new file mode 100644 index 00000000000000..6ae6db061bcf32 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-semantic_embedding_2_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English semantic_embedding_2 MPNetEmbeddings from myfi +author: John Snow Labs +name: semantic_embedding_2 +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`semantic_embedding_2` is a English model originally trained by myfi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/semantic_embedding_2_en_5.5.1_3.0_1734306396815.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/semantic_embedding_2_en_5.5.1_3.0_1734306396815.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("semantic_embedding_2","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("semantic_embedding_2","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
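Sentence-level MPNet vectors are commonly compared with cosine similarity. The following is an illustrative sketch only (not part of the model card): it assumes the Python pipeline above has been fit and that `numpy` is available, embeds two sentences, and compares them on the driver.

```python
import numpy as np

sentences = spark.createDataFrame([["I love spark-nlp"], ["Spark NLP is great"]]).toDF("text")
rows = pipelineModel.transform(sentences).select("embeddings").collect()

# Each row holds one annotation whose `embeddings` field is the sentence vector.
v1 = np.array(rows[0]["embeddings"][0]["embeddings"])
v2 = np.array(rows[1]["embeddings"][0]["embeddings"])

cosine = float(v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2)))
print(f"cosine similarity: {cosine:.3f}")
```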
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|semantic_embedding_2| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|382.7 MB| + +## References + +https://huggingface.co/myfi/semantic-embedding_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-semantic_embedding_2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-semantic_embedding_2_pipeline_en.md new file mode 100644 index 00000000000000..1265424b530f80 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-semantic_embedding_2_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English semantic_embedding_2_pipeline pipeline MPNetEmbeddings from myfi +author: John Snow Labs +name: semantic_embedding_2_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`semantic_embedding_2_pipeline` is a English model originally trained by myfi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/semantic_embedding_2_pipeline_en_5.5.1_3.0_1734306426525.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/semantic_embedding_2_pipeline_en_5.5.1_3.0_1734306426525.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("semantic_embedding_2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("semantic_embedding_2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|semantic_embedding_2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|382.7 MB| + +## References + +https://huggingface.co/myfi/semantic-embedding_2 + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_bert_arabic_german_english_indonesian_japanese_wikidump_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_arabic_german_english_indonesian_japanese_wikidump_en.md new file mode 100644 index 00000000000000..c58173e48553ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_arabic_german_english_indonesian_japanese_wikidump_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sent_bert_arabic_german_english_indonesian_japanese_wikidump BertSentenceEmbeddings from dehanalkautsar +author: John Snow Labs +name: sent_bert_arabic_german_english_indonesian_japanese_wikidump +date: 2024-12-15 +tags: [en, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_bert_arabic_german_english_indonesian_japanese_wikidump` is a English model originally trained by dehanalkautsar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_arabic_german_english_indonesian_japanese_wikidump_en_5.5.1_3.0_1734285013696.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_arabic_german_english_indonesian_japanese_wikidump_en_5.5.1_3.0_1734285013696.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_bert_arabic_german_english_indonesian_japanese_wikidump","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_bert_arabic_german_english_indonesian_japanese_wikidump","en") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
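Because the sentence detector may split a document into several sentences, the output column holds one annotation, and one vector (typically 768 dimensions for a BERT base model), per detected sentence. A brief, hypothetical check on the `pipelineDF` produced by the Python example above:

```python
from pyspark.sql import functions as F

(pipelineDF
    .select(F.explode("embeddings").alias("sent"))
    .select(F.col("sent.result").alias("sentence"),
            F.size("sent.embeddings").alias("dims"))
    .show(truncate=60))
```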
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_arabic_german_english_indonesian_japanese_wikidump| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/dehanalkautsar/bert_ar_de_en_id_ja_wikidump \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_bert_arabic_german_english_indonesian_japanese_wikidump_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_arabic_german_english_indonesian_japanese_wikidump_pipeline_en.md new file mode 100644 index 00000000000000..149e3e99fc31e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_arabic_german_english_indonesian_japanese_wikidump_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English sent_bert_arabic_german_english_indonesian_japanese_wikidump_pipeline pipeline BertSentenceEmbeddings from dehanalkautsar +author: John Snow Labs +name: sent_bert_arabic_german_english_indonesian_japanese_wikidump_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_bert_arabic_german_english_indonesian_japanese_wikidump_pipeline` is a English model originally trained by dehanalkautsar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_arabic_german_english_indonesian_japanese_wikidump_pipeline_en_5.5.1_3.0_1734285035246.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_arabic_german_english_indonesian_japanese_wikidump_pipeline_en_5.5.1_3.0_1734285035246.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_bert_arabic_german_english_indonesian_japanese_wikidump_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_bert_arabic_german_english_indonesian_japanese_wikidump_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_arabic_german_english_indonesian_japanese_wikidump_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|408.7 MB| + +## References + +https://huggingface.co/dehanalkautsar/bert_ar_de_en_id_ja_wikidump + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_bert_base_multilingual_cased_finetuned_hindi_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_base_multilingual_cased_finetuned_hindi_pipeline_xx.md new file mode 100644 index 00000000000000..43a8b3e9f44f9f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_base_multilingual_cased_finetuned_hindi_pipeline_xx.md @@ -0,0 +1,71 @@ +--- +layout: model +title: Multilingual sent_bert_base_multilingual_cased_finetuned_hindi_pipeline pipeline BertSentenceEmbeddings from pbwinter +author: John Snow Labs +name: sent_bert_base_multilingual_cased_finetuned_hindi_pipeline +date: 2024-12-15 +tags: [xx, open_source, pipeline, onnx] +task: Embeddings +language: xx +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_bert_base_multilingual_cased_finetuned_hindi_pipeline` is a Multilingual model originally trained by pbwinter. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_base_multilingual_cased_finetuned_hindi_pipeline_xx_5.5.1_3.0_1734285285428.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_base_multilingual_cased_finetuned_hindi_pipeline_xx_5.5.1_3.0_1734285285428.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_bert_base_multilingual_cased_finetuned_hindi_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_bert_base_multilingual_cased_finetuned_hindi_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_base_multilingual_cased_finetuned_hindi_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|665.6 MB| + +## References + +https://huggingface.co/pbwinter/bert-base-multilingual-cased-finetuned-hindi + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_bert_base_multilingual_cased_finetuned_hindi_xx.md b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_base_multilingual_cased_finetuned_hindi_xx.md new file mode 100644 index 00000000000000..3b1e1a724d70e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_base_multilingual_cased_finetuned_hindi_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual sent_bert_base_multilingual_cased_finetuned_hindi BertSentenceEmbeddings from pbwinter +author: John Snow Labs +name: sent_bert_base_multilingual_cased_finetuned_hindi +date: 2024-12-15 +tags: [xx, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: xx +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_bert_base_multilingual_cased_finetuned_hindi` is a Multilingual model originally trained by pbwinter. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_base_multilingual_cased_finetuned_hindi_xx_5.5.1_3.0_1734285249040.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_base_multilingual_cased_finetuned_hindi_xx_5.5.1_3.0_1734285249040.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

# Standard Spark NLP / PySpark imports for a self-contained run
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import SentenceDetectorDLModel, BertSentenceEmbeddings
from pyspark.ml import Pipeline

spark = sparknlp.start()

documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \
    .setInputCols(["document"]) \
    .setOutputCol("sentence")

embeddings = BertSentenceEmbeddings.pretrained("sent_bert_base_multilingual_cased_finetuned_hindi","xx") \
    .setInputCols(["sentence"]) \
    .setOutputCol("embeddings")

pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

// Standard Spark NLP imports for a self-contained (spark-shell style) run
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._  // needed for Seq(...).toDF

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")
    .setInputCols(Array("document"))
    .setOutputCol("sentence")

val embeddings = BertSentenceEmbeddings.pretrained("sent_bert_base_multilingual_cased_finetuned_hindi","xx")
    .setInputCols(Array("sentence"))
    .setOutputCol("embeddings")

val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings))
val data = Seq("I love spark-nlp").toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
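
In the transformed DataFrame, every detected sentence becomes one annotation in the `embeddings` column, and the vector itself sits in that annotation's nested `embeddings` field. The sketch below is one way to pull the vectors out, assuming the column names from the example above (768 dimensions for this BERT-base checkpoint).

```python
from pyspark.sql import functions as F

# One output row per sentence: the sentence text and its sentence-level vector
vectors = (pipelineDF
    .select(F.explode("embeddings").alias("ann"))
    .select(
        F.col("ann.result").alias("sentence"),
        F.col("ann.embeddings").alias("vector")
    ))
vectors.show(truncate=60)
```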
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_base_multilingual_cased_finetuned_hindi| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|xx| +|Size:|665.0 MB| + +## References + +https://huggingface.co/pbwinter/bert-base-multilingual-cased-finetuned-hindi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_bert_base_uncased_embedding_relative_key_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_base_uncased_embedding_relative_key_en.md new file mode 100644 index 00000000000000..00689d2ffcc9d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_base_uncased_embedding_relative_key_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sent_bert_base_uncased_embedding_relative_key BertSentenceEmbeddings from zhiheng-huang +author: John Snow Labs +name: sent_bert_base_uncased_embedding_relative_key +date: 2024-12-15 +tags: [en, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_bert_base_uncased_embedding_relative_key` is a English model originally trained by zhiheng-huang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_base_uncased_embedding_relative_key_en_5.5.1_3.0_1734285865537.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_base_uncased_embedding_relative_key_en_5.5.1_3.0_1734285865537.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_bert_base_uncased_embedding_relative_key","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_bert_base_uncased_embedding_relative_key","en") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_base_uncased_embedding_relative_key| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/zhiheng-huang/bert-base-uncased-embedding-relative-key \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_bert_base_uncased_embedding_relative_key_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_base_uncased_embedding_relative_key_pipeline_en.md new file mode 100644 index 00000000000000..41548131409e9e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_base_uncased_embedding_relative_key_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English sent_bert_base_uncased_embedding_relative_key_pipeline pipeline BertSentenceEmbeddings from zhiheng-huang +author: John Snow Labs +name: sent_bert_base_uncased_embedding_relative_key_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_bert_base_uncased_embedding_relative_key_pipeline` is a English model originally trained by zhiheng-huang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_base_uncased_embedding_relative_key_pipeline_en_5.5.1_3.0_1734285887382.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_base_uncased_embedding_relative_key_pipeline_en_5.5.1_3.0_1734285887382.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_bert_base_uncased_embedding_relative_key_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_bert_base_uncased_embedding_relative_key_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
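
To verify what a downloaded pretrained pipeline actually contains (it should mirror the Included Models list below), one option is to inspect the wrapped Spark ML `PipelineModel`; the sketch assumes the `pipeline` object from the Python example above.

```python
# The PretrainedPipeline wraps a Spark ML PipelineModel whose stages can be listed
for stage in pipeline.model.stages:
    print(type(stage).__name__)
# Expected, per the Included Models section: DocumentAssembler, TokenizerModel,
# SentenceDetectorDLModel, BertSentenceEmbeddings
```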
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_base_uncased_embedding_relative_key_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|410.2 MB| + +## References + +https://huggingface.co/zhiheng-huang/bert-base-uncased-embedding-relative-key + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_bert_large_uncased_whole_word_masking_embedding_relative_key_query_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_large_uncased_whole_word_masking_embedding_relative_key_query_en.md new file mode 100644 index 00000000000000..c21eea027900fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_large_uncased_whole_word_masking_embedding_relative_key_query_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sent_bert_large_uncased_whole_word_masking_embedding_relative_key_query BertSentenceEmbeddings from zhiheng-huang +author: John Snow Labs +name: sent_bert_large_uncased_whole_word_masking_embedding_relative_key_query +date: 2024-12-15 +tags: [en, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_bert_large_uncased_whole_word_masking_embedding_relative_key_query` is a English model originally trained by zhiheng-huang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_large_uncased_whole_word_masking_embedding_relative_key_query_en_5.5.1_3.0_1734285885232.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_large_uncased_whole_word_masking_embedding_relative_key_query_en_5.5.1_3.0_1734285885232.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_bert_large_uncased_whole_word_masking_embedding_relative_key_query","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_bert_large_uncased_whole_word_masking_embedding_relative_key_query","en") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
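
At roughly 1.3 GB, this whole-word-masking large checkpoint is noticeably heavier than the BERT-base models above, so the driver should be given ample memory before loading it. A sketch of starting the session with a larger allocation follows; the 16G figure is an illustrative choice, not a measured requirement.

```python
import sparknlp

# Allocate extra driver memory before downloading and loading the ~1.3 GB model
spark = sparknlp.start(memory="16G")
```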
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_large_uncased_whole_word_masking_embedding_relative_key_query| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/zhiheng-huang/bert-large-uncased-whole-word-masking-embedding-relative-key-query \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_bert_large_uncased_whole_word_masking_embedding_relative_key_query_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_large_uncased_whole_word_masking_embedding_relative_key_query_pipeline_en.md new file mode 100644 index 00000000000000..913dc182f2b7a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_large_uncased_whole_word_masking_embedding_relative_key_query_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English sent_bert_large_uncased_whole_word_masking_embedding_relative_key_query_pipeline pipeline BertSentenceEmbeddings from zhiheng-huang +author: John Snow Labs +name: sent_bert_large_uncased_whole_word_masking_embedding_relative_key_query_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_bert_large_uncased_whole_word_masking_embedding_relative_key_query_pipeline` is a English model originally trained by zhiheng-huang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_large_uncased_whole_word_masking_embedding_relative_key_query_pipeline_en_5.5.1_3.0_1734285956932.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_large_uncased_whole_word_masking_embedding_relative_key_query_pipeline_en_5.5.1_3.0_1734285956932.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_bert_large_uncased_whole_word_masking_embedding_relative_key_query_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_bert_large_uncased_whole_word_masking_embedding_relative_key_query_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_large_uncased_whole_word_masking_embedding_relative_key_query_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/zhiheng-huang/bert-large-uncased-whole-word-masking-embedding-relative-key-query + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_bert_small_juman_unigram_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_small_juman_unigram_en.md new file mode 100644 index 00000000000000..9afbb2e36602fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_small_juman_unigram_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sent_bert_small_juman_unigram BertSentenceEmbeddings from schnell +author: John Snow Labs +name: sent_bert_small_juman_unigram +date: 2024-12-15 +tags: [en, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_bert_small_juman_unigram` is a English model originally trained by schnell. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_small_juman_unigram_en_5.5.1_3.0_1734285358936.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_small_juman_unigram_en_5.5.1_3.0_1734285358936.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_bert_small_juman_unigram","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_bert_small_juman_unigram","en") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
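
Once fitted, the pipeline can be persisted with the standard Spark ML API and reloaded later without downloading the model again, which helps for repeated batch jobs or offline clusters. A minimal sketch, with a placeholder path:

```python
from pyspark.ml import PipelineModel

# Save the fitted pipeline, including the downloaded embeddings, to disk
pipelineModel.write().overwrite().save("/tmp/sent_bert_small_juman_unigram_model")

# Reload it later (same Spark NLP version) and reuse it as-is
restored = PipelineModel.load("/tmp/sent_bert_small_juman_unigram_model")
restored.transform(data).select("embeddings").show(1, truncate=60)
```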
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_small_juman_unigram| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|109.8 MB| + +## References + +https://huggingface.co/schnell/bert-small-juman-unigram \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_bert_small_juman_unigram_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_small_juman_unigram_pipeline_en.md new file mode 100644 index 00000000000000..ea1bd88e3cc1c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_small_juman_unigram_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English sent_bert_small_juman_unigram_pipeline pipeline BertSentenceEmbeddings from schnell +author: John Snow Labs +name: sent_bert_small_juman_unigram_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_bert_small_juman_unigram_pipeline` is a English model originally trained by schnell. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_small_juman_unigram_pipeline_en_5.5.1_3.0_1734285364374.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_small_juman_unigram_pipeline_en_5.5.1_3.0_1734285364374.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_bert_small_juman_unigram_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_bert_small_juman_unigram_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_small_juman_unigram_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|110.3 MB| + +## References + +https://huggingface.co/schnell/bert-small-juman-unigram + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_bert_web_bulgarian_bg.md b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_web_bulgarian_bg.md new file mode 100644 index 00000000000000..d5f01a7de4815b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_web_bulgarian_bg.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Bulgarian sent_bert_web_bulgarian BertSentenceEmbeddings from usmiva +author: John Snow Labs +name: sent_bert_web_bulgarian +date: 2024-12-15 +tags: [bg, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: bg +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_bert_web_bulgarian` is a Bulgarian model originally trained by usmiva. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_web_bulgarian_bg_5.5.1_3.0_1734285741435.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_web_bulgarian_bg_5.5.1_3.0_1734285741435.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_bert_web_bulgarian","bg") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_bert_web_bulgarian","bg") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
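
The template above feeds the English sample sentence, but this is a Bulgarian checkpoint, so in practice the input column would hold Bulgarian text. A small sketch with an illustrative Bulgarian sentence; the rest of the pipeline is unchanged.

```python
# Bulgarian input text in the same "text" column
data_bg = spark.createDataFrame([["Обичам Spark NLP."]]).toDF("text")
result_bg = pipelineModel.transform(data_bg)
result_bg.select("embeddings.embeddings").show(1, truncate=60)
```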
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_web_bulgarian| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|bg| +|Size:|406.9 MB| + +## References + +https://huggingface.co/usmiva/bert-web-bg \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_bert_web_bulgarian_pipeline_bg.md b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_web_bulgarian_pipeline_bg.md new file mode 100644 index 00000000000000..5c947cb5b38a22 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_web_bulgarian_pipeline_bg.md @@ -0,0 +1,71 @@ +--- +layout: model +title: Bulgarian sent_bert_web_bulgarian_pipeline pipeline BertSentenceEmbeddings from usmiva +author: John Snow Labs +name: sent_bert_web_bulgarian_pipeline +date: 2024-12-15 +tags: [bg, open_source, pipeline, onnx] +task: Embeddings +language: bg +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_bert_web_bulgarian_pipeline` is a Bulgarian model originally trained by usmiva. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_web_bulgarian_pipeline_bg_5.5.1_3.0_1734285763078.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_web_bulgarian_pipeline_bg_5.5.1_3.0_1734285763078.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_bert_web_bulgarian_pipeline", lang = "bg") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_bert_web_bulgarian_pipeline", lang = "bg") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_web_bulgarian_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|bg| +|Size:|407.4 MB| + +## References + +https://huggingface.co/usmiva/bert-web-bg + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_bert_wordpiece_phonetic_cleaned_wikitext_103_raw_v1_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_wordpiece_phonetic_cleaned_wikitext_103_raw_v1_en.md new file mode 100644 index 00000000000000..716e220fb34af8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_wordpiece_phonetic_cleaned_wikitext_103_raw_v1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sent_bert_wordpiece_phonetic_cleaned_wikitext_103_raw_v1 BertSentenceEmbeddings from psktoure +author: John Snow Labs +name: sent_bert_wordpiece_phonetic_cleaned_wikitext_103_raw_v1 +date: 2024-12-15 +tags: [en, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_bert_wordpiece_phonetic_cleaned_wikitext_103_raw_v1` is a English model originally trained by psktoure. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_wordpiece_phonetic_cleaned_wikitext_103_raw_v1_en_5.5.1_3.0_1734285610785.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_wordpiece_phonetic_cleaned_wikitext_103_raw_v1_en_5.5.1_3.0_1734285610785.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_bert_wordpiece_phonetic_cleaned_wikitext_103_raw_v1","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_bert_wordpiece_phonetic_cleaned_wikitext_103_raw_v1","en") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
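
Sentence vectors like these are usually consumed downstream, for example to score semantic similarity. The sketch below collects two vectors to the driver and compares them with NumPy; this is only sensible for a handful of rows, and it assumes the `pipelineModel` fitted in the example above.

```python
import numpy as np
from pyspark.sql import functions as F

pairs = spark.createDataFrame(
    [["I love spark-nlp"], ["Spark NLP is great"]]
).toDF("text")

vecs = (pipelineModel.transform(pairs)
    .select(F.explode("embeddings").alias("ann"))
    .select("ann.embeddings")
    .collect())

a = np.array(vecs[0][0], dtype=float)
b = np.array(vecs[1][0], dtype=float)
cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
print(f"cosine similarity: {cosine:.4f}")
```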
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_wordpiece_phonetic_cleaned_wikitext_103_raw_v1| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|408.3 MB| + +## References + +https://huggingface.co/psktoure/BERT_WordPiece_phonetic_cleaned_wikitext-103-raw-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_bert_wordpiece_phonetic_cleaned_wikitext_103_raw_v1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_wordpiece_phonetic_cleaned_wikitext_103_raw_v1_pipeline_en.md new file mode 100644 index 00000000000000..c29bcd6ec7e488 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_wordpiece_phonetic_cleaned_wikitext_103_raw_v1_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English sent_bert_wordpiece_phonetic_cleaned_wikitext_103_raw_v1_pipeline pipeline BertSentenceEmbeddings from psktoure +author: John Snow Labs +name: sent_bert_wordpiece_phonetic_cleaned_wikitext_103_raw_v1_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_bert_wordpiece_phonetic_cleaned_wikitext_103_raw_v1_pipeline` is a English model originally trained by psktoure. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_wordpiece_phonetic_cleaned_wikitext_103_raw_v1_pipeline_en_5.5.1_3.0_1734285634382.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_wordpiece_phonetic_cleaned_wikitext_103_raw_v1_pipeline_en_5.5.1_3.0_1734285634382.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_bert_wordpiece_phonetic_cleaned_wikitext_103_raw_v1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_bert_wordpiece_phonetic_cleaned_wikitext_103_raw_v1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_wordpiece_phonetic_cleaned_wikitext_103_raw_v1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|408.9 MB| + +## References + +https://huggingface.co/psktoure/BERT_WordPiece_phonetic_cleaned_wikitext-103-raw-v1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_bert_wordpiece_wikitext_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_wordpiece_wikitext_en.md new file mode 100644 index 00000000000000..d9b3bbfa9b1714 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_wordpiece_wikitext_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sent_bert_wordpiece_wikitext BertSentenceEmbeddings from psktoure +author: John Snow Labs +name: sent_bert_wordpiece_wikitext +date: 2024-12-15 +tags: [en, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_bert_wordpiece_wikitext` is a English model originally trained by psktoure. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_wordpiece_wikitext_en_5.5.1_3.0_1734285272056.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_wordpiece_wikitext_en_5.5.1_3.0_1734285272056.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_bert_wordpiece_wikitext","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_bert_wordpiece_wikitext","en") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_wordpiece_wikitext| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|408.3 MB| + +## References + +https://huggingface.co/psktoure/BERT_WordPiece_wikitext \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_bert_wordpiece_wikitext_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_wordpiece_wikitext_pipeline_en.md new file mode 100644 index 00000000000000..13b7b09d040a14 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_bert_wordpiece_wikitext_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English sent_bert_wordpiece_wikitext_pipeline pipeline BertSentenceEmbeddings from psktoure +author: John Snow Labs +name: sent_bert_wordpiece_wikitext_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_bert_wordpiece_wikitext_pipeline` is a English model originally trained by psktoure. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_bert_wordpiece_wikitext_pipeline_en_5.5.1_3.0_1734285293700.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_bert_wordpiece_wikitext_pipeline_en_5.5.1_3.0_1734285293700.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_bert_wordpiece_wikitext_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_bert_wordpiece_wikitext_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_bert_wordpiece_wikitext_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|408.8 MB| + +## References + +https://huggingface.co/psktoure/BERT_WordPiece_wikitext + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_betelgeuse_bert_base_uncased_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_betelgeuse_bert_base_uncased_en.md new file mode 100644 index 00000000000000..702bcc606c1007 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_betelgeuse_bert_base_uncased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sent_betelgeuse_bert_base_uncased BertSentenceEmbeddings from prithivMLmods +author: John Snow Labs +name: sent_betelgeuse_bert_base_uncased +date: 2024-12-15 +tags: [en, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_betelgeuse_bert_base_uncased` is a English model originally trained by prithivMLmods. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_betelgeuse_bert_base_uncased_en_5.5.1_3.0_1734285477489.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_betelgeuse_bert_base_uncased_en_5.5.1_3.0_1734285477489.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_betelgeuse_bert_base_uncased","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_betelgeuse_bert_base_uncased","en") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
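
For real workloads, the one-line sample DataFrame would be replaced by a corpus read from storage. A brief sketch reading newline-delimited documents from a directory (the path is a placeholder), then embedding them with the fitted pipeline:

```python
# Each line of the input files becomes one row in the "text" column
corpus = (spark.read.text("/data/corpus/*.txt")
    .withColumnRenamed("value", "text"))

embedded = pipelineModel.transform(corpus)
embedded.select("text", "embeddings.embeddings").show(5, truncate=60)
```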
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_betelgeuse_bert_base_uncased| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/prithivMLmods/Betelgeuse-bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_betelgeuse_bert_base_uncased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_betelgeuse_bert_base_uncased_pipeline_en.md new file mode 100644 index 00000000000000..010d40cbbb1876 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_betelgeuse_bert_base_uncased_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English sent_betelgeuse_bert_base_uncased_pipeline pipeline BertSentenceEmbeddings from prithivMLmods +author: John Snow Labs +name: sent_betelgeuse_bert_base_uncased_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_betelgeuse_bert_base_uncased_pipeline` is a English model originally trained by prithivMLmods. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_betelgeuse_bert_base_uncased_pipeline_en_5.5.1_3.0_1734285499869.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_betelgeuse_bert_base_uncased_pipeline_en_5.5.1_3.0_1734285499869.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_betelgeuse_bert_base_uncased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_betelgeuse_bert_base_uncased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_betelgeuse_bert_base_uncased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.7 MB| + +## References + +https://huggingface.co/prithivMLmods/Betelgeuse-bert-base-uncased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_clinicalbert_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_clinicalbert_en.md new file mode 100644 index 00000000000000..6d1687c628c482 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_clinicalbert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sent_clinicalbert BertSentenceEmbeddings from nazyrova +author: John Snow Labs +name: sent_clinicalbert +date: 2024-12-15 +tags: [en, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_clinicalbert` is a English model originally trained by nazyrova. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_clinicalbert_en_5.5.1_3.0_1734285999442.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_clinicalbert_en_5.5.1_3.0_1734285999442.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_clinicalbert","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_clinicalbert","en") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
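
When the goal is low-latency annotation of individual documents, such as single clinical notes, rather than a distributed DataFrame job, the fitted pipeline can be wrapped in a `LightPipeline`. A minimal sketch reusing `pipelineModel` from the example above; the input sentence is illustrative.

```python
from sparknlp.base import LightPipeline

light = LightPipeline(pipelineModel)

# Runs on the driver without building a DataFrame; returns annotations per input string
result = light.fullAnnotate("The patient was discharged in stable condition.")
print(result[0]["embeddings"][0].embeddings[:5])
```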
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_clinicalbert| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|412.5 MB| + +## References + +https://huggingface.co/nazyrova/clinicalBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_clinicalbert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_clinicalbert_pipeline_en.md new file mode 100644 index 00000000000000..b25b9d687d927f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_clinicalbert_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English sent_clinicalbert_pipeline pipeline BertSentenceEmbeddings from nazyrova +author: John Snow Labs +name: sent_clinicalbert_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_clinicalbert_pipeline` is a English model originally trained by nazyrova. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_clinicalbert_pipeline_en_5.5.1_3.0_1734286022357.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_clinicalbert_pipeline_en_5.5.1_3.0_1734286022357.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_clinicalbert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_clinicalbert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_clinicalbert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|413.1 MB| + +## References + +https://huggingface.co/nazyrova/clinicalBERT + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_covid_vaccine_twitter_bert_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_covid_vaccine_twitter_bert_en.md new file mode 100644 index 00000000000000..95cf88325aded5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_covid_vaccine_twitter_bert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sent_covid_vaccine_twitter_bert BertSentenceEmbeddings from GateNLP +author: John Snow Labs +name: sent_covid_vaccine_twitter_bert +date: 2024-12-15 +tags: [en, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_covid_vaccine_twitter_bert` is a English model originally trained by GateNLP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_covid_vaccine_twitter_bert_en_5.5.1_3.0_1734285957393.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_covid_vaccine_twitter_bert_en_5.5.1_3.0_1734285957393.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_covid_vaccine_twitter_bert","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_covid_vaccine_twitter_bert","en") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
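
This is another large (~1.3 GB) checkpoint, and inference over sizeable tweet collections can be slow on CPU. If a CUDA-capable GPU and the GPU build of Spark NLP are available, the session can be started with GPU support; whether this actually helps depends on the local setup.

```python
import sparknlp

# Requires the GPU build of Spark NLP and a compatible CUDA environment
spark = sparknlp.start(gpu=True)
```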
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_covid_vaccine_twitter_bert| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/GateNLP/covid-vaccine-twitter-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_covid_vaccine_twitter_bert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_covid_vaccine_twitter_bert_pipeline_en.md new file mode 100644 index 00000000000000..01e02ee6c390a1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_covid_vaccine_twitter_bert_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English sent_covid_vaccine_twitter_bert_pipeline pipeline BertSentenceEmbeddings from GateNLP +author: John Snow Labs +name: sent_covid_vaccine_twitter_bert_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_covid_vaccine_twitter_bert_pipeline` is a English model originally trained by GateNLP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_covid_vaccine_twitter_bert_pipeline_en_5.5.1_3.0_1734286036635.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_covid_vaccine_twitter_bert_pipeline_en_5.5.1_3.0_1734286036635.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_covid_vaccine_twitter_bert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_covid_vaccine_twitter_bert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_covid_vaccine_twitter_bert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/GateNLP/covid-vaccine-twitter-bert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_germanfinbert_sardinian_de.md b/docs/_posts/ahmedlone127/2024-12-15-sent_germanfinbert_sardinian_de.md new file mode 100644 index 00000000000000..ecf170919299e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_germanfinbert_sardinian_de.md @@ -0,0 +1,94 @@ +--- +layout: model +title: German sent_germanfinbert_sardinian BertSentenceEmbeddings from scherrmann +author: John Snow Labs +name: sent_germanfinbert_sardinian +date: 2024-12-15 +tags: [de, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: de +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_germanfinbert_sardinian` is a German model originally trained by scherrmann. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_germanfinbert_sardinian_de_5.5.1_3.0_1734284917689.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_germanfinbert_sardinian_de_5.5.1_3.0_1734284917689.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_germanfinbert_sardinian","de") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_germanfinbert_sardinian","de") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
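The transformed DataFrame stores one annotation per detected sentence in the `embeddings` column, with the vector kept in the annotation's nested `embeddings` field. A minimal sketch, using the column names defined above, for pulling each sentence and its vector out of `pipelineDF`:

```python
# Each annotation is a struct: `result` holds the sentence text and `embeddings`
# holds the float vector produced by the model.
pipelineDF.selectExpr("explode(embeddings) as annotation") \
    .selectExpr("annotation.result as sentence", "annotation.embeddings as vector") \
    .show(truncate=60)
```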
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_germanfinbert_sardinian| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|de| +|Size:|405.8 MB| + +## References + +https://huggingface.co/scherrmann/GermanFinBert_SC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_germanfinbert_sardinian_pipeline_de.md b/docs/_posts/ahmedlone127/2024-12-15-sent_germanfinbert_sardinian_pipeline_de.md new file mode 100644 index 00000000000000..63b372c8dc92f7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_germanfinbert_sardinian_pipeline_de.md @@ -0,0 +1,71 @@ +--- +layout: model +title: German sent_germanfinbert_sardinian_pipeline pipeline BertSentenceEmbeddings from scherrmann +author: John Snow Labs +name: sent_germanfinbert_sardinian_pipeline +date: 2024-12-15 +tags: [de, open_source, pipeline, onnx] +task: Embeddings +language: de +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_germanfinbert_sardinian_pipeline` is a German model originally trained by scherrmann. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_germanfinbert_sardinian_pipeline_de_5.5.1_3.0_1734284939699.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_germanfinbert_sardinian_pipeline_de_5.5.1_3.0_1734284939699.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_germanfinbert_sardinian_pipeline", lang = "de") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_germanfinbert_sardinian_pipeline", lang = "de") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_germanfinbert_sardinian_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|de| +|Size:|406.4 MB| + +## References + +https://huggingface.co/scherrmann/GermanFinBert_SC + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_gujiroberta_jian_fan_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_gujiroberta_jian_fan_en.md new file mode 100644 index 00000000000000..b1f628d50eb670 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_gujiroberta_jian_fan_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sent_gujiroberta_jian_fan BertSentenceEmbeddings from hsc748NLP +author: John Snow Labs +name: sent_gujiroberta_jian_fan +date: 2024-12-15 +tags: [en, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_gujiroberta_jian_fan` is a English model originally trained by hsc748NLP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_gujiroberta_jian_fan_en_5.5.1_3.0_1734285348755.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_gujiroberta_jian_fan_en_5.5.1_3.0_1734285348755.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_gujiroberta_jian_fan","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_gujiroberta_jian_fan","en") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_gujiroberta_jian_fan| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|420.2 MB| + +## References + +https://huggingface.co/hsc748NLP/GujiRoBERTa_jian_fan \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_gujiroberta_jian_fan_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_gujiroberta_jian_fan_pipeline_en.md new file mode 100644 index 00000000000000..2f1185613e11f0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_gujiroberta_jian_fan_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English sent_gujiroberta_jian_fan_pipeline pipeline BertSentenceEmbeddings from hsc748NLP +author: John Snow Labs +name: sent_gujiroberta_jian_fan_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_gujiroberta_jian_fan_pipeline` is a English model originally trained by hsc748NLP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_gujiroberta_jian_fan_pipeline_en_5.5.1_3.0_1734285371406.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_gujiroberta_jian_fan_pipeline_en_5.5.1_3.0_1734285371406.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_gujiroberta_jian_fan_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_gujiroberta_jian_fan_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_gujiroberta_jian_fan_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|420.7 MB| + +## References + +https://huggingface.co/hsc748NLP/GujiRoBERTa_jian_fan + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_kinyabert_large_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_kinyabert_large_en.md new file mode 100644 index 00000000000000..e1d5943c03495c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_kinyabert_large_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sent_kinyabert_large BertSentenceEmbeddings from jean-paul +author: John Snow Labs +name: sent_kinyabert_large +date: 2024-12-15 +tags: [en, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_kinyabert_large` is a English model originally trained by jean-paul. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_kinyabert_large_en_5.5.1_3.0_1734285641157.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_kinyabert_large_en_5.5.1_3.0_1734285641157.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_kinyabert_large","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_kinyabert_large","en") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_kinyabert_large| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|406.7 MB| + +## References + +https://huggingface.co/jean-paul/KinyaBERT-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_kinyabert_large_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_kinyabert_large_pipeline_en.md new file mode 100644 index 00000000000000..d1ae524645d107 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_kinyabert_large_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English sent_kinyabert_large_pipeline pipeline BertSentenceEmbeddings from jean-paul +author: John Snow Labs +name: sent_kinyabert_large_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_kinyabert_large_pipeline` is a English model originally trained by jean-paul. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_kinyabert_large_pipeline_en_5.5.1_3.0_1734285663956.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_kinyabert_large_pipeline_en_5.5.1_3.0_1734285663956.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_kinyabert_large_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_kinyabert_large_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_kinyabert_large_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.3 MB| + +## References + +https://huggingface.co/jean-paul/KinyaBERT-large + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_lsg_legal_small_uncased_4096_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_lsg_legal_small_uncased_4096_en.md new file mode 100644 index 00000000000000..dc2620a5802653 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_lsg_legal_small_uncased_4096_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sent_lsg_legal_small_uncased_4096 BertSentenceEmbeddings from ccdv +author: John Snow Labs +name: sent_lsg_legal_small_uncased_4096 +date: 2024-12-15 +tags: [en, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_lsg_legal_small_uncased_4096` is a English model originally trained by ccdv. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_lsg_legal_small_uncased_4096_en_5.5.1_3.0_1734284862187.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_lsg_legal_small_uncased_4096_en_5.5.1_3.0_1734284862187.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_lsg_legal_small_uncased_4096","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_lsg_legal_small_uncased_4096","en") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
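The original checkpoint targets long legal documents (up to 4096 tokens), while the annotator truncates each sentence at its `maxSentenceLength` setting, so long inputs may need that value raised. Whether the full 4096-token context is usable depends on the limits of the Spark NLP runtime; the value below is illustrative only:

```python
# Sketch only: raise the per-sentence token limit from the default before fitting.
# Treat 512 as an illustrative value, not a recommendation; the accepted upper bound
# depends on the Spark NLP version.
embeddings = BertSentenceEmbeddings.pretrained("sent_lsg_legal_small_uncased_4096", "en") \
    .setInputCols(["sentence"]) \
    .setOutputCol("embeddings") \
    .setMaxSentenceLength(512)
```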
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_lsg_legal_small_uncased_4096| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|137.4 MB| + +## References + +https://huggingface.co/ccdv/lsg-legal-small-uncased-4096 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_lsg_legal_small_uncased_4096_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_lsg_legal_small_uncased_4096_pipeline_en.md new file mode 100644 index 00000000000000..9b4e297a4789bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_lsg_legal_small_uncased_4096_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English sent_lsg_legal_small_uncased_4096_pipeline pipeline BertSentenceEmbeddings from ccdv +author: John Snow Labs +name: sent_lsg_legal_small_uncased_4096_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_lsg_legal_small_uncased_4096_pipeline` is a English model originally trained by ccdv. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_lsg_legal_small_uncased_4096_pipeline_en_5.5.1_3.0_1734284869025.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_lsg_legal_small_uncased_4096_pipeline_en_5.5.1_3.0_1734284869025.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_lsg_legal_small_uncased_4096_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_lsg_legal_small_uncased_4096_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_lsg_legal_small_uncased_4096_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|138.0 MB| + +## References + +https://huggingface.co/ccdv/lsg-legal-small-uncased-4096 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_m2_bert_32k_retrieval_encoder_v1_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_m2_bert_32k_retrieval_encoder_v1_en.md new file mode 100644 index 00000000000000..b6d7ffc210d860 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_m2_bert_32k_retrieval_encoder_v1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sent_m2_bert_32k_retrieval_encoder_v1 BertSentenceEmbeddings from hazyresearch +author: John Snow Labs +name: sent_m2_bert_32k_retrieval_encoder_v1 +date: 2024-12-15 +tags: [en, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_m2_bert_32k_retrieval_encoder_v1` is a English model originally trained by hazyresearch. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_m2_bert_32k_retrieval_encoder_v1_en_5.5.1_3.0_1734285434661.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_m2_bert_32k_retrieval_encoder_v1_en_5.5.1_3.0_1734285434661.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_m2_bert_32k_retrieval_encoder_v1","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_m2_bert_32k_retrieval_encoder_v1","en") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
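Since this checkpoint was released as a retrieval encoder, a common follow-up is to compare two inputs by the cosine similarity of their vectors. A minimal sketch reusing `pipelineModel` from above and collecting to the driver with NumPy; the query strings are placeholders:

```python
import numpy as np

queries = spark.createDataFrame([["first query"], ["second query"]]).toDF("text")
rows = pipelineModel.transform(queries) \
    .selectExpr("explode(embeddings) as annotation") \
    .select("annotation.embeddings") \
    .collect()

# One vector per (single-sentence) input; compute cosine similarity locally.
v1, v2 = (np.array(r["embeddings"]) for r in rows[:2])
cosine = float(v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2)))
print(cosine)
```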
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_m2_bert_32k_retrieval_encoder_v1| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|499.0 MB| + +## References + +https://huggingface.co/hazyresearch/M2-BERT-32K-Retrieval-Encoder-V1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_m2_bert_32k_retrieval_encoder_v1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_m2_bert_32k_retrieval_encoder_v1_pipeline_en.md new file mode 100644 index 00000000000000..3886d03e8a181b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_m2_bert_32k_retrieval_encoder_v1_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English sent_m2_bert_32k_retrieval_encoder_v1_pipeline pipeline BertSentenceEmbeddings from hazyresearch +author: John Snow Labs +name: sent_m2_bert_32k_retrieval_encoder_v1_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_m2_bert_32k_retrieval_encoder_v1_pipeline` is a English model originally trained by hazyresearch. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_m2_bert_32k_retrieval_encoder_v1_pipeline_en_5.5.1_3.0_1734285462850.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_m2_bert_32k_retrieval_encoder_v1_pipeline_en_5.5.1_3.0_1734285462850.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_m2_bert_32k_retrieval_encoder_v1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_m2_bert_32k_retrieval_encoder_v1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_m2_bert_32k_retrieval_encoder_v1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|499.5 MB| + +## References + +https://huggingface.co/hazyresearch/M2-BERT-32K-Retrieval-Encoder-V1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_patentbert_cased_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_patentbert_cased_en.md new file mode 100644 index 00000000000000..3509c156a7e114 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_patentbert_cased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sent_patentbert_cased BertSentenceEmbeddings from dheerajpai +author: John Snow Labs +name: sent_patentbert_cased +date: 2024-12-15 +tags: [en, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_patentbert_cased` is a English model originally trained by dheerajpai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_patentbert_cased_en_5.5.1_3.0_1734285626476.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_patentbert_cased_en_5.5.1_3.0_1734285626476.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_patentbert_cased","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_patentbert_cased","en") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_patentbert_cased| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|75.6 MB| + +## References + +https://huggingface.co/dheerajpai/patentbert-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_patentbert_cased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_patentbert_cased_pipeline_en.md new file mode 100644 index 00000000000000..757016bb70cb9b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_patentbert_cased_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English sent_patentbert_cased_pipeline pipeline BertSentenceEmbeddings from dheerajpai +author: John Snow Labs +name: sent_patentbert_cased_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_patentbert_cased_pipeline` is a English model originally trained by dheerajpai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_patentbert_cased_pipeline_en_5.5.1_3.0_1734285630216.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_patentbert_cased_pipeline_en_5.5.1_3.0_1734285630216.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_patentbert_cased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_patentbert_cased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_patentbert_cased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|76.1 MB| + +## References + +https://huggingface.co/dheerajpai/patentbert-cased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_sinhala_bert_medium_v2_pipeline_si.md b/docs/_posts/ahmedlone127/2024-12-15-sent_sinhala_bert_medium_v2_pipeline_si.md new file mode 100644 index 00000000000000..8565a93caf7c13 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_sinhala_bert_medium_v2_pipeline_si.md @@ -0,0 +1,71 @@ +--- +layout: model +title: Sinhala, Sinhalese sent_sinhala_bert_medium_v2_pipeline pipeline BertSentenceEmbeddings from Ransaka +author: John Snow Labs +name: sent_sinhala_bert_medium_v2_pipeline +date: 2024-12-15 +tags: [si, open_source, pipeline, onnx] +task: Embeddings +language: si +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_sinhala_bert_medium_v2_pipeline` is a Sinhala, Sinhalese model originally trained by Ransaka. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_sinhala_bert_medium_v2_pipeline_si_5.5.1_3.0_1734284875205.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_sinhala_bert_medium_v2_pipeline_si_5.5.1_3.0_1734284875205.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_sinhala_bert_medium_v2_pipeline", lang = "si") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_sinhala_bert_medium_v2_pipeline", lang = "si") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_sinhala_bert_medium_v2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|si| +|Size:|187.9 MB| + +## References + +https://huggingface.co/Ransaka/sinhala-bert-medium-v2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_sinhala_bert_medium_v2_si.md b/docs/_posts/ahmedlone127/2024-12-15-sent_sinhala_bert_medium_v2_si.md new file mode 100644 index 00000000000000..ddbfb5e9c4ba47 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_sinhala_bert_medium_v2_si.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Sinhala, Sinhalese sent_sinhala_bert_medium_v2 BertSentenceEmbeddings from Ransaka +author: John Snow Labs +name: sent_sinhala_bert_medium_v2 +date: 2024-12-15 +tags: [si, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: si +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_sinhala_bert_medium_v2` is a Sinhala, Sinhalese model originally trained by Ransaka. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_sinhala_bert_medium_v2_si_5.5.1_3.0_1734284864269.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_sinhala_bert_medium_v2_si_5.5.1_3.0_1734284864269.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_sinhala_bert_medium_v2","si") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_sinhala_bert_medium_v2","si") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_sinhala_bert_medium_v2| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|si| +|Size:|187.4 MB| + +## References + +https://huggingface.co/Ransaka/sinhala-bert-medium-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_tiny_bert_turkish_cased_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_tiny_bert_turkish_cased_en.md new file mode 100644 index 00000000000000..c9bf697d7a42b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_tiny_bert_turkish_cased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sent_tiny_bert_turkish_cased BertSentenceEmbeddings from uygarkurt +author: John Snow Labs +name: sent_tiny_bert_turkish_cased +date: 2024-12-15 +tags: [en, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_tiny_bert_turkish_cased` is a English model originally trained by uygarkurt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_tiny_bert_turkish_cased_en_5.5.1_3.0_1734285560150.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_tiny_bert_turkish_cased_en_5.5.1_3.0_1734285560150.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_tiny_bert_turkish_cased","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_tiny_bert_turkish_cased","en") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
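To feed the vectors into downstream Spark ML stages, the nested annotation structs can be flattened with `EmbeddingsFinisher`. A minimal sketch appended to the pipeline defined above; the output column name is an arbitrary choice for this example:

```python
from sparknlp.base import EmbeddingsFinisher

finisher = EmbeddingsFinisher() \
    .setInputCols(["embeddings"]) \
    .setOutputCols(["finished_embeddings"]) \
    .setOutputAsVector(True)

pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings, finisher])
pipelineModel = pipeline.fit(data)
pipelineModel.transform(data).select("finished_embeddings").show(truncate=60)
```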
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_tiny_bert_turkish_cased| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|17.5 MB| + +## References + +https://huggingface.co/uygarkurt/tiny-bert-turkish-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_tiny_bert_turkish_cased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_tiny_bert_turkish_cased_pipeline_en.md new file mode 100644 index 00000000000000..97bd95353ead51 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_tiny_bert_turkish_cased_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English sent_tiny_bert_turkish_cased_pipeline pipeline BertSentenceEmbeddings from uygarkurt +author: John Snow Labs +name: sent_tiny_bert_turkish_cased_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_tiny_bert_turkish_cased_pipeline` is a English model originally trained by uygarkurt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_tiny_bert_turkish_cased_pipeline_en_5.5.1_3.0_1734285561554.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_tiny_bert_turkish_cased_pipeline_en_5.5.1_3.0_1734285561554.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_tiny_bert_turkish_cased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_tiny_bert_turkish_cased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_tiny_bert_turkish_cased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|18.0 MB| + +## References + +https://huggingface.co/uygarkurt/tiny-bert-turkish-cased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_tiny_mlm_glue_qqp_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_tiny_mlm_glue_qqp_en.md new file mode 100644 index 00000000000000..ed7b53f33534f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_tiny_mlm_glue_qqp_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sent_tiny_mlm_glue_qqp BertSentenceEmbeddings from muhtasham +author: John Snow Labs +name: sent_tiny_mlm_glue_qqp +date: 2024-12-15 +tags: [en, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_tiny_mlm_glue_qqp` is a English model originally trained by muhtasham. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_tiny_mlm_glue_qqp_en_5.5.1_3.0_1734285710959.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_tiny_mlm_glue_qqp_en_5.5.1_3.0_1734285710959.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_tiny_mlm_glue_qqp","en") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_tiny_mlm_glue_qqp","en") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_tiny_mlm_glue_qqp| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/muhtasham/tiny-mlm-glue-qqp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_tiny_mlm_glue_qqp_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-sent_tiny_mlm_glue_qqp_pipeline_en.md new file mode 100644 index 00000000000000..f84fdbb45b070b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_tiny_mlm_glue_qqp_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English sent_tiny_mlm_glue_qqp_pipeline pipeline BertSentenceEmbeddings from muhtasham +author: John Snow Labs +name: sent_tiny_mlm_glue_qqp_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_tiny_mlm_glue_qqp_pipeline` is a English model originally trained by muhtasham. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_tiny_mlm_glue_qqp_pipeline_en_5.5.1_3.0_1734285712149.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_tiny_mlm_glue_qqp_pipeline_en_5.5.1_3.0_1734285712149.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_tiny_mlm_glue_qqp_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_tiny_mlm_glue_qqp_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_tiny_mlm_glue_qqp_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|17.2 MB| + +## References + +https://huggingface.co/muhtasham/tiny-mlm-glue-qqp + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_turkish_medium_bert_uncased_pipeline_tr.md b/docs/_posts/ahmedlone127/2024-12-15-sent_turkish_medium_bert_uncased_pipeline_tr.md new file mode 100644 index 00000000000000..6e3e6d2b496779 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_turkish_medium_bert_uncased_pipeline_tr.md @@ -0,0 +1,71 @@ +--- +layout: model +title: Turkish sent_turkish_medium_bert_uncased_pipeline pipeline BertSentenceEmbeddings from ytu-ce-cosmos +author: John Snow Labs +name: sent_turkish_medium_bert_uncased_pipeline +date: 2024-12-15 +tags: [tr, open_source, pipeline, onnx] +task: Embeddings +language: tr +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_turkish_medium_bert_uncased_pipeline` is a Turkish model originally trained by ytu-ce-cosmos. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_turkish_medium_bert_uncased_pipeline_tr_5.5.1_3.0_1734285190442.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_turkish_medium_bert_uncased_pipeline_tr_5.5.1_3.0_1734285190442.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sent_turkish_medium_bert_uncased_pipeline", lang = "tr") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sent_turkish_medium_bert_uncased_pipeline", lang = "tr") +val annotations = pipeline.transform(df) + +``` +
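This pipeline wraps a Turkish sentence-embedding model, so meaningful vectors require Turkish input; as elsewhere, `df` is assumed to be a DataFrame with a `text` column. A small sketch with an illustrative Turkish placeholder sentence; the exact output column names are set by the pretrained pipeline, so the schema is printed rather than assumed:

```python
# Illustrative input; replace with real Turkish text.
df = spark.createDataFrame([["Spark NLP'yi çok seviyorum."]]).toDF("text")
annotations = pipeline.transform(df)
annotations.printSchema()  # shows the columns added by the pipeline stages
```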
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_turkish_medium_bert_uncased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|tr| +|Size:|157.9 MB| + +## References + +https://huggingface.co/ytu-ce-cosmos/turkish-medium-bert-uncased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- SentenceDetectorDLModel +- BertSentenceEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sent_turkish_medium_bert_uncased_tr.md b/docs/_posts/ahmedlone127/2024-12-15-sent_turkish_medium_bert_uncased_tr.md new file mode 100644 index 00000000000000..06e233415fe1a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sent_turkish_medium_bert_uncased_tr.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Turkish sent_turkish_medium_bert_uncased BertSentenceEmbeddings from ytu-ce-cosmos +author: John Snow Labs +name: sent_turkish_medium_bert_uncased +date: 2024-12-15 +tags: [tr, open_source, onnx, sentence_embeddings, bert] +task: Embeddings +language: tr +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertSentenceEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertSentenceEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sent_turkish_medium_bert_uncased` is a Turkish model originally trained by ytu-ce-cosmos. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_turkish_medium_bert_uncased_tr_5.5.1_3.0_1734285182396.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_turkish_medium_bert_uncased_tr_5.5.1_3.0_1734285182396.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +embeddings = BertSentenceEmbeddings.pretrained("sent_turkish_medium_bert_uncased","tr") \ + .setInputCols(["sentence"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, sentenceDL, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val embeddings = BertSentenceEmbeddings.pretrained("sent_turkish_medium_bert_uncased","tr") + .setInputCols(Array("sentence")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_turkish_medium_bert_uncased| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence]| +|Output Labels:|[embeddings]| +|Language:|tr| +|Size:|157.4 MB| + +## References + +https://huggingface.co/ytu-ce-cosmos/turkish-medium-bert-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sentence_transformer_1_en.md b/docs/_posts/ahmedlone127/2024-12-15-sentence_transformer_1_en.md new file mode 100644 index 00000000000000..a0f97386cdfa97 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sentence_transformer_1_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English sentence_transformer_1 MPNetEmbeddings from DIS-Project +author: John Snow Labs +name: sentence_transformer_1 +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sentence_transformer_1` is a English model originally trained by DIS-Project. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentence_transformer_1_en_5.5.1_3.0_1734306697458.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentence_transformer_1_en_5.5.1_3.0_1734306697458.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("sentence_transformer_1","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("sentence_transformer_1","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
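MPNetEmbeddings produces one pooled vector per input document (the `document` column above) rather than per token. A minimal sketch for retrieving those vectors from `pipelineDF`, using the column names defined in this example:

```python
# One annotation per document; the nested `embeddings` field is the pooled vector.
pipelineDF.selectExpr("explode(embeddings) as annotation") \
    .selectExpr("annotation.embeddings as vector") \
    .show(truncate=60)
```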
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentence_transformer_1| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|401.7 MB| + +## References + +https://huggingface.co/DIS-Project/Sentence-Transformer_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sentence_transformer_1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-sentence_transformer_1_pipeline_en.md new file mode 100644 index 00000000000000..f0b6ef8942e98e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sentence_transformer_1_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English sentence_transformer_1_pipeline pipeline MPNetEmbeddings from DIS-Project +author: John Snow Labs +name: sentence_transformer_1_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sentence_transformer_1_pipeline` is a English model originally trained by DIS-Project. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentence_transformer_1_pipeline_en_5.5.1_3.0_1734306719959.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentence_transformer_1_pipeline_en_5.5.1_3.0_1734306719959.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sentence_transformer_1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sentence_transformer_1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentence_transformer_1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|401.7 MB| + +## References + +https://huggingface.co/DIS-Project/Sentence-Transformer_1 + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sentence_transformer_en.md b/docs/_posts/ahmedlone127/2024-12-15-sentence_transformer_en.md new file mode 100644 index 00000000000000..4b47b8baeb371a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sentence_transformer_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English sentence_transformer MPNetEmbeddings from DIS-Project +author: John Snow Labs +name: sentence_transformer +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sentence_transformer` is a English model originally trained by DIS-Project. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentence_transformer_en_5.5.1_3.0_1734306364070.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentence_transformer_en_5.5.1_3.0_1734306364070.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("sentence_transformer","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("sentence_transformer","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
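+
+If you want to verify what the transform produced, the snippet below (a small sketch continuing from the `pipelineDF` above; nothing model-specific beyond the `embeddings` output column) prints the vector size per document:
+
+```python
+from pyspark.sql import functions as F
+
+# One embedding per input document; `size` reports the vector dimensionality.
+pipelineDF.select(F.explode("embeddings.embeddings").alias("vector")) \
+    .select(F.size("vector").alias("dim")) \
+    .show()
+```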
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentence_transformer| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|401.3 MB| + +## References + +https://huggingface.co/DIS-Project/Sentence-Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sentence_transformer_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-sentence_transformer_pipeline_en.md new file mode 100644 index 00000000000000..1fa39fcbbbae74 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sentence_transformer_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English sentence_transformer_pipeline pipeline MPNetEmbeddings from DIS-Project +author: John Snow Labs +name: sentence_transformer_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sentence_transformer_pipeline` is a English model originally trained by DIS-Project. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentence_transformer_pipeline_en_5.5.1_3.0_1734306387041.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentence_transformer_pipeline_en_5.5.1_3.0_1734306387041.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sentence_transformer_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sentence_transformer_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentence_transformer_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|401.3 MB| + +## References + +https://huggingface.co/DIS-Project/Sentence-Transformer + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sentence_transformers_all_mpnet_base_v2_en.md b/docs/_posts/ahmedlone127/2024-12-15-sentence_transformers_all_mpnet_base_v2_en.md new file mode 100644 index 00000000000000..921d957a969eaf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sentence_transformers_all_mpnet_base_v2_en.md @@ -0,0 +1,90 @@ +--- +layout: model +title: English sentence_transformers_all_mpnet_base_v2 MPNetEmbeddings from ai-human-lab +author: John Snow Labs +name: sentence_transformers_all_mpnet_base_v2 +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sentence_transformers_all_mpnet_base_v2` is a English model originally trained by ai-human-lab. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentence_transformers_all_mpnet_base_v2_en_5.5.1_3.0_1734306394880.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentence_transformers_all_mpnet_base_v2_en_5.5.1_3.0_1734306394880.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("sentence_transformers_all_mpnet_base_v2","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("sentence_transformers_all_mpnet_base_v2","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
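+
+A short follow-up sketch (Python, reusing the `pipelineDF` from above; the only assumption is the `embeddings` output column configured earlier) to peek at the raw vectors:
+
+```python
+# Flatten the annotation structure down to the float arrays themselves.
+vectors = pipelineDF.selectExpr("explode(embeddings.embeddings) as embedding")
+vectors.show(1, truncate=100)
+```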
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentence_transformers_all_mpnet_base_v2| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.9 MB| + +## References + +References + +References + +https://huggingface.co/ai-human-lab/sentence-transformers_all-mpnet-base-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sentence_transformers_all_mpnet_base_v2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-sentence_transformers_all_mpnet_base_v2_pipeline_en.md new file mode 100644 index 00000000000000..6fb377c9d7d2ae --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sentence_transformers_all_mpnet_base_v2_pipeline_en.md @@ -0,0 +1,73 @@ +--- +layout: model +title: English sentence_transformers_all_mpnet_base_v2_pipeline pipeline MPNetEmbeddings from ai-human-lab +author: John Snow Labs +name: sentence_transformers_all_mpnet_base_v2_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sentence_transformers_all_mpnet_base_v2_pipeline` is a English model originally trained by ai-human-lab. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentence_transformers_all_mpnet_base_v2_pipeline_en_5.5.1_3.0_1734306419895.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentence_transformers_all_mpnet_base_v2_pipeline_en_5.5.1_3.0_1734306419895.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +pipeline = PretrainedPipeline("sentence_transformers_all_mpnet_base_v2_pipeline", lang = "en") +annotations = pipeline.transform(df) +``` +```scala +val pipeline = new PretrainedPipeline("sentence_transformers_all_mpnet_base_v2_pipeline", lang = "en") +val annotations = pipeline.transform(df) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentence_transformers_all_mpnet_base_v2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.9 MB| + +## References + +References + +References + +https://huggingface.co/ai-human-lab/sentence-transformers_all-mpnet-base-v2 + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sentiment_roberta_restaurant_10_en.md b/docs/_posts/ahmedlone127/2024-12-15-sentiment_roberta_restaurant_10_en.md new file mode 100644 index 00000000000000..796382189c47a7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sentiment_roberta_restaurant_10_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sentiment_roberta_restaurant_10 RoBertaForSequenceClassification from pachequinho +author: John Snow Labs +name: sentiment_roberta_restaurant_10 +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sentiment_roberta_restaurant_10` is a English model originally trained by pachequinho. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_roberta_restaurant_10_en_5.5.1_3.0_1734287847952.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_roberta_restaurant_10_en_5.5.1_3.0_1734287847952.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_roberta_restaurant_10","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_roberta_restaurant_10", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
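+
+Once the pipeline has run, the predicted label can be read straight off `pipelineDF` (a small Python sketch under the same assumptions as the example above; `class` is the output column configured on the classifier):
+
+```python
+# `class.result` holds the predicted label string for each input row.
+pipelineDF.selectExpr("text", "`class`.result as predicted_label").show(truncate=False)
+```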
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_roberta_restaurant_10| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/pachequinho/sentiment_roberta_restaurant_10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-sentiment_roberta_restaurant_10_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-sentiment_roberta_restaurant_10_pipeline_en.md new file mode 100644 index 00000000000000..b7fc4b13383359 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-sentiment_roberta_restaurant_10_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English sentiment_roberta_restaurant_10_pipeline pipeline RoBertaForSequenceClassification from pachequinho +author: John Snow Labs +name: sentiment_roberta_restaurant_10_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sentiment_roberta_restaurant_10_pipeline` is a English model originally trained by pachequinho. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_roberta_restaurant_10_pipeline_en_5.5.1_3.0_1734287876928.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_roberta_restaurant_10_pipeline_en_5.5.1_3.0_1734287876928.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sentiment_roberta_restaurant_10_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sentiment_roberta_restaurant_10_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_roberta_restaurant_10_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/pachequinho/sentiment_roberta_restaurant_10 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-setfit_mbti_multiclass_w266_en.md b/docs/_posts/ahmedlone127/2024-12-15-setfit_mbti_multiclass_w266_en.md new file mode 100644 index 00000000000000..bbd59b0b9c4776 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-setfit_mbti_multiclass_w266_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English setfit_mbti_multiclass_w266 MPNetEmbeddings from shrinivasbjoshi +author: John Snow Labs +name: setfit_mbti_multiclass_w266 +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`setfit_mbti_multiclass_w266` is a English model originally trained by shrinivasbjoshi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/setfit_mbti_multiclass_w266_en_5.5.1_3.0_1734306819542.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/setfit_mbti_multiclass_w266_en_5.5.1_3.0_1734306819542.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("setfit_mbti_multiclass_w266","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("setfit_mbti_multiclass_w266","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
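+
+To confirm the embeddings are populated, a minimal Python check (continuing from the `pipelineDF` above):
+
+```python
+# Count the produced vectors and report their dimensionality.
+pipelineDF.selectExpr("explode(embeddings.embeddings) as vector") \
+    .selectExpr("size(vector) as dim") \
+    .show()
+```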
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|setfit_mbti_multiclass_w266| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|407.3 MB| + +## References + +https://huggingface.co/shrinivasbjoshi/setfit-mbti-multiclass-w266 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-setfit_mbti_multiclass_w266_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-setfit_mbti_multiclass_w266_pipeline_en.md new file mode 100644 index 00000000000000..1ed312bfb01a6d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-setfit_mbti_multiclass_w266_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English setfit_mbti_multiclass_w266_pipeline pipeline MPNetEmbeddings from shrinivasbjoshi +author: John Snow Labs +name: setfit_mbti_multiclass_w266_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`setfit_mbti_multiclass_w266_pipeline` is a English model originally trained by shrinivasbjoshi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/setfit_mbti_multiclass_w266_pipeline_en_5.5.1_3.0_1734306840553.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/setfit_mbti_multiclass_w266_pipeline_en_5.5.1_3.0_1734306840553.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("setfit_mbti_multiclass_w266_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("setfit_mbti_multiclass_w266_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|setfit_mbti_multiclass_w266_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.3 MB| + +## References + +https://huggingface.co/shrinivasbjoshi/setfit-mbti-multiclass-w266 + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-setfit_model_calgary_epochs2_en.md b/docs/_posts/ahmedlone127/2024-12-15-setfit_model_calgary_epochs2_en.md new file mode 100644 index 00000000000000..c84f885f37ce17 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-setfit_model_calgary_epochs2_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English setfit_model_calgary_epochs2 MPNetEmbeddings from mitra-mir +author: John Snow Labs +name: setfit_model_calgary_epochs2 +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`setfit_model_calgary_epochs2` is a English model originally trained by mitra-mir. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/setfit_model_calgary_epochs2_en_5.5.1_3.0_1734306515480.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/setfit_model_calgary_epochs2_en_5.5.1_3.0_1734306515480.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("setfit_model_calgary_epochs2","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("setfit_model_calgary_epochs2","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
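+
+A quick way to inspect the output (a sketch in Python, assuming only the `pipelineDF` and `embeddings` column defined above):
+
+```python
+# Each row exposes one sentence embedding as an array of floats.
+pipelineDF.selectExpr("explode(embeddings.embeddings) as embedding").show(1, truncate=80)
+```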
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|setfit_model_calgary_epochs2| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.8 MB| + +## References + +https://huggingface.co/mitra-mir/setfit_model_Calgary_epochs2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-setfit_model_calgary_epochs2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-setfit_model_calgary_epochs2_pipeline_en.md new file mode 100644 index 00000000000000..384e5c11c6b4ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-setfit_model_calgary_epochs2_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English setfit_model_calgary_epochs2_pipeline pipeline MPNetEmbeddings from mitra-mir +author: John Snow Labs +name: setfit_model_calgary_epochs2_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`setfit_model_calgary_epochs2_pipeline` is a English model originally trained by mitra-mir. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/setfit_model_calgary_epochs2_pipeline_en_5.5.1_3.0_1734306536843.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/setfit_model_calgary_epochs2_pipeline_en_5.5.1_3.0_1734306536843.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("setfit_model_calgary_epochs2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("setfit_model_calgary_epochs2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|setfit_model_calgary_epochs2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.8 MB| + +## References + +https://huggingface.co/mitra-mir/setfit_model_Calgary_epochs2 + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-setfit_product_review_en.md b/docs/_posts/ahmedlone127/2024-12-15-setfit_product_review_en.md new file mode 100644 index 00000000000000..2aa342bf77e058 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-setfit_product_review_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English setfit_product_review MPNetEmbeddings from ivanzidov +author: John Snow Labs +name: setfit_product_review +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`setfit_product_review` is a English model originally trained by ivanzidov. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/setfit_product_review_en_5.5.1_3.0_1734306665394.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/setfit_product_review_en_5.5.1_3.0_1734306665394.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("setfit_product_review","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("setfit_product_review","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
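+
+As with the other embedding models, the vectors can be pulled out of `pipelineDF` directly (minimal Python sketch, same assumptions as the example above):
+
+```python
+# Explode the annotation column to get one embedding array per document.
+pipelineDF.selectExpr("explode(embeddings.embeddings) as vector") \
+    .selectExpr("size(vector) as dimensions") \
+    .show()
+```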
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|setfit_product_review| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|407.0 MB| + +## References + +https://huggingface.co/ivanzidov/setfit-product-review \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-setfit_product_review_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-setfit_product_review_pipeline_en.md new file mode 100644 index 00000000000000..8c5e06a4cf66de --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-setfit_product_review_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English setfit_product_review_pipeline pipeline MPNetEmbeddings from ivanzidov +author: John Snow Labs +name: setfit_product_review_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`setfit_product_review_pipeline` is a English model originally trained by ivanzidov. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/setfit_product_review_pipeline_en_5.5.1_3.0_1734306686092.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/setfit_product_review_pipeline_en_5.5.1_3.0_1734306686092.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("setfit_product_review_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("setfit_product_review_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|setfit_product_review_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.0 MB| + +## References + +https://huggingface.co/ivanzidov/setfit-product-review + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-somd_xlm_stage2_v2_en.md b/docs/_posts/ahmedlone127/2024-12-15-somd_xlm_stage2_v2_en.md new file mode 100644 index 00000000000000..81199fff3947ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-somd_xlm_stage2_v2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English somd_xlm_stage2_v2 XlmRoBertaForSequenceClassification from ThuyNT03 +author: John Snow Labs +name: somd_xlm_stage2_v2 +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`somd_xlm_stage2_v2` is a English model originally trained by ThuyNT03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/somd_xlm_stage2_v2_en_5.5.1_3.0_1734291347058.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/somd_xlm_stage2_v2_en_5.5.1_3.0_1734291347058.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("somd_xlm_stage2_v2","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("somd_xlm_stage2_v2", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
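+
+To read the prediction back out (a brief Python sketch continuing from the code above; `class` is the classifier's output column):
+
+```python
+# The predicted label string lives in the `result` field of the `class` annotations.
+pipelineDF.selectExpr("text", "`class`.result as predicted_label").show(truncate=False)
+```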
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|somd_xlm_stage2_v2| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|782.0 MB| + +## References + +https://huggingface.co/ThuyNT03/SOMD-xlm-stage2-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-somd_xlm_stage2_v2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-somd_xlm_stage2_v2_pipeline_en.md new file mode 100644 index 00000000000000..c2d0454e6865d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-somd_xlm_stage2_v2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English somd_xlm_stage2_v2_pipeline pipeline XlmRoBertaForSequenceClassification from ThuyNT03 +author: John Snow Labs +name: somd_xlm_stage2_v2_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`somd_xlm_stage2_v2_pipeline` is a English model originally trained by ThuyNT03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/somd_xlm_stage2_v2_pipeline_en_5.5.1_3.0_1734291482683.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/somd_xlm_stage2_v2_pipeline_en_5.5.1_3.0_1734291482683.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("somd_xlm_stage2_v2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("somd_xlm_stage2_v2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|somd_xlm_stage2_v2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|782.0 MB| + +## References + +https://huggingface.co/ThuyNT03/SOMD-xlm-stage2-v2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-speech_latex3_en.md b/docs/_posts/ahmedlone127/2024-12-15-speech_latex3_en.md new file mode 100644 index 00000000000000..a67d144a11ff15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-speech_latex3_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English speech_latex3 T5Transformer from vinalal +author: John Snow Labs +name: speech_latex3 +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`speech_latex3` is a English model originally trained by vinalal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/speech_latex3_en_5.5.1_3.0_1734299748681.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/speech_latex3_en_5.5.1_3.0_1734299748681.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("speech_latex3","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("speech_latex3", "en")
+    .setInputCols(Array("document"))
+    .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
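+
+To see the generated text, a short follow-up in Python (assuming the `pipelineDF` from the example above; `output` matches the transformer's `setOutputCol`):
+
+```python
+# The T5 annotations expose the generated sequence through their `result` field.
+pipelineDF.selectExpr("text", "output.result as generated_text").show(truncate=False)
+```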
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|speech_latex3| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|317.6 MB| + +## References + +https://huggingface.co/vinalal/speech-latex3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-speech_latex3_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-speech_latex3_pipeline_en.md new file mode 100644 index 00000000000000..db41c08f3a0bb6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-speech_latex3_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English speech_latex3_pipeline pipeline T5Transformer from vinalal +author: John Snow Labs +name: speech_latex3_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`speech_latex3_pipeline` is a English model originally trained by vinalal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/speech_latex3_pipeline_en_5.5.1_3.0_1734299772551.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/speech_latex3_pipeline_en_5.5.1_3.0_1734299772551.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("speech_latex3_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("speech_latex3_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|speech_latex3_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|317.6 MB| + +## References + +https://huggingface.co/vinalal/speech-latex3 + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-squad_bert_model_en.md b/docs/_posts/ahmedlone127/2024-12-15-squad_bert_model_en.md new file mode 100644 index 00000000000000..a4f862b72db79b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-squad_bert_model_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English squad_bert_model BertForQuestionAnswering from Drashtip +author: John Snow Labs +name: squad_bert_model +date: 2024-12-15 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`squad_bert_model` is a English model originally trained by Drashtip. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/squad_bert_model_en_5.5.1_3.0_1734296670532.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/squad_bert_model_en_5.5.1_3.0_1734296670532.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("squad_bert_model","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
+data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("squad_bert_model", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDF("question", "context")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
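+
+The predicted answer span can then be read from the transformed DataFrame (a minimal Python sketch continuing from the example above):
+
+```python
+# `answer.result` contains the extracted answer text for each question/context pair.
+pipelineDF.selectExpr("question", "context", "answer.result as answer").show(truncate=False)
+```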
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|squad_bert_model| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/Drashtip/Squad_Bert_Model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-squad_bert_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-squad_bert_model_pipeline_en.md new file mode 100644 index 00000000000000..6a3bbb3c1e3741 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-squad_bert_model_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English squad_bert_model_pipeline pipeline BertForQuestionAnswering from Drashtip +author: John Snow Labs +name: squad_bert_model_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`squad_bert_model_pipeline` is a English model originally trained by Drashtip. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/squad_bert_model_pipeline_en_5.5.1_3.0_1734296691514.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/squad_bert_model_pipeline_en_5.5.1_3.0_1734296691514.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("squad_bert_model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("squad_bert_model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|squad_bert_model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/Drashtip/Squad_Bert_Model + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-stance_twitter_xlm_target_oblivious_en.md b/docs/_posts/ahmedlone127/2024-12-15-stance_twitter_xlm_target_oblivious_en.md new file mode 100644 index 00000000000000..03edbfb8f5bdcd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-stance_twitter_xlm_target_oblivious_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English stance_twitter_xlm_target_oblivious XlmRoBertaForSequenceClassification from GateNLP +author: John Snow Labs +name: stance_twitter_xlm_target_oblivious +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`stance_twitter_xlm_target_oblivious` is a English model originally trained by GateNLP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/stance_twitter_xlm_target_oblivious_en_5.5.1_3.0_1734292220074.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/stance_twitter_xlm_target_oblivious_en_5.5.1_3.0_1734292220074.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("stance_twitter_xlm_target_oblivious","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("stance_twitter_xlm_target_oblivious", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
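+
+Reading the predicted stance label back out is a one-liner (Python sketch, same assumptions as the snippet above; `class` is the configured output column):
+
+```python
+# `result` on the `class` annotations holds the predicted label per input row.
+pipelineDF.selectExpr("text", "`class`.result as predicted_label").show(truncate=False)
+```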
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|stance_twitter_xlm_target_oblivious| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/GateNLP/stance-twitter-xlm-target-oblivious \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-stance_twitter_xlm_target_oblivious_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-stance_twitter_xlm_target_oblivious_pipeline_en.md new file mode 100644 index 00000000000000..e9767104272a00 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-stance_twitter_xlm_target_oblivious_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English stance_twitter_xlm_target_oblivious_pipeline pipeline XlmRoBertaForSequenceClassification from GateNLP +author: John Snow Labs +name: stance_twitter_xlm_target_oblivious_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`stance_twitter_xlm_target_oblivious_pipeline` is a English model originally trained by GateNLP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/stance_twitter_xlm_target_oblivious_pipeline_en_5.5.1_3.0_1734292276626.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/stance_twitter_xlm_target_oblivious_pipeline_en_5.5.1_3.0_1734292276626.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("stance_twitter_xlm_target_oblivious_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("stance_twitter_xlm_target_oblivious_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|stance_twitter_xlm_target_oblivious_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/GateNLP/stance-twitter-xlm-target-oblivious + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-t5_base_grammar_corrector_en.md b/docs/_posts/ahmedlone127/2024-12-15-t5_base_grammar_corrector_en.md new file mode 100644 index 00000000000000..a2ea0cbfc80724 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-t5_base_grammar_corrector_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English t5_base_grammar_corrector T5Transformer from akhmat-s +author: John Snow Labs +name: t5_base_grammar_corrector +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_base_grammar_corrector` is a English model originally trained by akhmat-s. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_base_grammar_corrector_en_5.5.1_3.0_1734302042080.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_base_grammar_corrector_en_5.5.1_3.0_1734302042080.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("t5_base_grammar_corrector","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("t5_base_grammar_corrector", "en")
+    .setInputCols(Array("document"))
+    .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
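+
+To inspect the corrected text, a brief Python follow-up (reusing the `pipelineDF` built above; `output` is the column set on the transformer):
+
+```python
+# The generated (corrected) sequence is carried in the `result` field of the output annotations.
+pipelineDF.selectExpr("text", "output.result as corrected").show(truncate=False)
+```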
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_base_grammar_corrector| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/akhmat-s/t5-base-grammar-corrector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-t5_base_grammar_corrector_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-t5_base_grammar_corrector_pipeline_en.md new file mode 100644 index 00000000000000..1343d823092beb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-t5_base_grammar_corrector_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English t5_base_grammar_corrector_pipeline pipeline T5Transformer from akhmat-s +author: John Snow Labs +name: t5_base_grammar_corrector_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_base_grammar_corrector_pipeline` is a English model originally trained by akhmat-s. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_base_grammar_corrector_pipeline_en_5.5.1_3.0_1734302092871.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_base_grammar_corrector_pipeline_en_5.5.1_3.0_1734302092871.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("t5_base_grammar_corrector_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("t5_base_grammar_corrector_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
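
`df` above is any Spark DataFrame with a text column; a hedged sketch of one way to build it and read the generated text back (the `text` and `output` column names are assumed from the single-model page for this model, not stated in the pipeline metadata):

```python
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("t5_base_grammar_corrector_pipeline", lang = "en")

# Assuming the DocumentAssembler inside the pipeline reads a "text" column.
df = spark.createDataFrame([["She don't likes apples ."]]).toDF("text")
annotations = pipeline.transform(df)

# Assuming the T5 stage inside the pipeline writes to an "output" column.
annotations.selectExpr("explode(output.result) as corrected").show(truncate=False)
```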
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_base_grammar_corrector_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/akhmat-s/t5-base-grammar-corrector + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-t5_chatbot_customersupport_en.md b/docs/_posts/ahmedlone127/2024-12-15-t5_chatbot_customersupport_en.md new file mode 100644 index 00000000000000..41f4ba2e8ed1f4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-t5_chatbot_customersupport_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English t5_chatbot_customersupport T5Transformer from sunbv56 +author: John Snow Labs +name: t5_chatbot_customersupport +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_chatbot_customersupport` is a English model originally trained by sunbv56. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_chatbot_customersupport_en_5.5.1_3.0_1734299081388.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_chatbot_customersupport_en_5.5.1_3.0_1734299081388.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("t5_chatbot_customersupport","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("t5_chatbot_customersupport", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
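
As a small illustrative sketch (the support messages below are invented examples, not from the original model card), a batch of texts can be scored with the fitted pipeline and the generated replies read from the `output` column:

```python
# Score a small batch of customer-support style messages with the pipeline above.
messages = spark.createDataFrame(
    [["My order never arrived, what should I do?"],
     ["How can I reset my password?"]]
).toDF("text")

pipelineModel.transform(messages) \
    .selectExpr("text", "explode(output.result) as reply") \
    .show(truncate=False)
```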
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_chatbot_customersupport| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|998.3 MB| + +## References + +https://huggingface.co/sunbv56/T5_Chatbot_CustomerSupport \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-t5_chatbot_customersupport_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-t5_chatbot_customersupport_pipeline_en.md new file mode 100644 index 00000000000000..ab9450d52a41a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-t5_chatbot_customersupport_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English t5_chatbot_customersupport_pipeline pipeline T5Transformer from sunbv56 +author: John Snow Labs +name: t5_chatbot_customersupport_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_chatbot_customersupport_pipeline` is a English model originally trained by sunbv56. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_chatbot_customersupport_pipeline_en_5.5.1_3.0_1734299135132.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_chatbot_customersupport_pipeline_en_5.5.1_3.0_1734299135132.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("t5_chatbot_customersupport_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("t5_chatbot_customersupport_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_chatbot_customersupport_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|998.3 MB| + +## References + +https://huggingface.co/sunbv56/T5_Chatbot_CustomerSupport + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-t5_keyword_generation_finetuned_en.md b/docs/_posts/ahmedlone127/2024-12-15-t5_keyword_generation_finetuned_en.md new file mode 100644 index 00000000000000..5b0e6107a81c48 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-t5_keyword_generation_finetuned_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English t5_keyword_generation_finetuned T5Transformer from dgobran +author: John Snow Labs +name: t5_keyword_generation_finetuned +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_keyword_generation_finetuned` is a English model originally trained by dgobran. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_keyword_generation_finetuned_en_5.5.1_3.0_1734300250773.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_keyword_generation_finetuned_en_5.5.1_3.0_1734300250773.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("t5_keyword_generation_finetuned","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("t5_keyword_generation_finetuned", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
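
A minimal sketch of generating keywords for a longer passage with the fitted pipeline above (the passage is an arbitrary example):

```python
# Generate keywords for a short passage and print them from the "output" column.
passage = spark.createDataFrame(
    [["Spark NLP offers production-grade NLP annotators that run natively on Apache Spark."]]
).toDF("text")

pipelineModel.transform(passage) \
    .selectExpr("explode(output.result) as keywords") \
    .show(truncate=False)
```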
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_keyword_generation_finetuned| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|337.8 MB| + +## References + +https://huggingface.co/dgobran/t5-keyword-generation-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-t5_keyword_generation_finetuned_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-t5_keyword_generation_finetuned_pipeline_en.md new file mode 100644 index 00000000000000..0272c44c5970b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-t5_keyword_generation_finetuned_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English t5_keyword_generation_finetuned_pipeline pipeline T5Transformer from dgobran +author: John Snow Labs +name: t5_keyword_generation_finetuned_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_keyword_generation_finetuned_pipeline` is a English model originally trained by dgobran. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_keyword_generation_finetuned_pipeline_en_5.5.1_3.0_1734300269506.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_keyword_generation_finetuned_pipeline_en_5.5.1_3.0_1734300269506.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("t5_keyword_generation_finetuned_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("t5_keyword_generation_finetuned_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_keyword_generation_finetuned_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|337.8 MB| + +## References + +https://huggingface.co/dgobran/t5-keyword-generation-finetuned + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-t5_portuguese_quiz_en.md b/docs/_posts/ahmedlone127/2024-12-15-t5_portuguese_quiz_en.md new file mode 100644 index 00000000000000..b1121a08a8abf8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-t5_portuguese_quiz_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English t5_portuguese_quiz T5Transformer from yoshidevs +author: John Snow Labs +name: t5_portuguese_quiz +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_portuguese_quiz` is a English model originally trained by yoshidevs. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_portuguese_quiz_en_5.5.1_3.0_1734302073863.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_portuguese_quiz_en_5.5.1_3.0_1734302073863.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("t5_portuguese_quiz","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("t5_portuguese_quiz", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
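
For quick single-string inference, the fitted pipeline above can also be wrapped in a `LightPipeline`. The sketch below assumes the same column names as the example and uses an invented Portuguese sentence:

```python
from sparknlp.base import LightPipeline

# Driver-side inference without building a DataFrame for every request.
light = LightPipeline(pipelineModel)
annotations = light.annotate("O Brasil é o maior país da América do Sul.")

# "output" is the column name configured on the T5 stage above.
print(annotations["output"])
```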
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_portuguese_quiz| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|847.1 MB| + +## References + +https://huggingface.co/yoshidevs/t5-portuguese-quiz \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-t5_portuguese_quiz_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-t5_portuguese_quiz_pipeline_en.md new file mode 100644 index 00000000000000..5f1e86cd80c2ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-t5_portuguese_quiz_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English t5_portuguese_quiz_pipeline pipeline T5Transformer from yoshidevs +author: John Snow Labs +name: t5_portuguese_quiz_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_portuguese_quiz_pipeline` is a English model originally trained by yoshidevs. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_portuguese_quiz_pipeline_en_5.5.1_3.0_1734302154584.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_portuguese_quiz_pipeline_en_5.5.1_3.0_1734302154584.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("t5_portuguese_quiz_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("t5_portuguese_quiz_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_portuguese_quiz_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|847.1 MB| + +## References + +https://huggingface.co/yoshidevs/t5-portuguese-quiz + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-t5_small_sum_dpo_177k_32_1ep_en.md b/docs/_posts/ahmedlone127/2024-12-15-t5_small_sum_dpo_177k_32_1ep_en.md new file mode 100644 index 00000000000000..fdff3b58a8687e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-t5_small_sum_dpo_177k_32_1ep_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English t5_small_sum_dpo_177k_32_1ep T5Transformer from Muadil +author: John Snow Labs +name: t5_small_sum_dpo_177k_32_1ep +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_small_sum_dpo_177k_32_1ep` is a English model originally trained by Muadil. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_small_sum_dpo_177k_32_1ep_en_5.5.1_3.0_1734301637402.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_small_sum_dpo_177k_32_1ep_en_5.5.1_3.0_1734301637402.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("t5_small_sum_dpo_177k_32_1ep","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("t5_small_sum_dpo_177k_32_1ep", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
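
A hedged sketch of summarizing a longer paragraph with the fitted pipeline above (the paragraph is an arbitrary example, and no task prefix is assumed):

```python
# Summarize a longer passage and print the generated summary from "output".
long_text = (
    "Apache Spark is an open-source engine for large-scale data processing. "
    "It exposes APIs in Scala, Java, Python and R, and ships libraries for "
    "SQL, streaming, machine learning and graph workloads."
)

docs = spark.createDataFrame([[long_text]]).toDF("text")
pipelineModel.transform(docs) \
    .selectExpr("explode(output.result) as summary") \
    .show(truncate=False)
```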
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_small_sum_dpo_177k_32_1ep| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|187.3 MB| + +## References + +https://huggingface.co/Muadil/t5-small_sum_DPO_177k_32_1ep \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-t5_small_sum_dpo_177k_32_1ep_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-t5_small_sum_dpo_177k_32_1ep_pipeline_en.md new file mode 100644 index 00000000000000..7a2880ab1b3eec --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-t5_small_sum_dpo_177k_32_1ep_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English t5_small_sum_dpo_177k_32_1ep_pipeline pipeline T5Transformer from Muadil +author: John Snow Labs +name: t5_small_sum_dpo_177k_32_1ep_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_small_sum_dpo_177k_32_1ep_pipeline` is a English model originally trained by Muadil. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_small_sum_dpo_177k_32_1ep_pipeline_en_5.5.1_3.0_1734301695030.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_small_sum_dpo_177k_32_1ep_pipeline_en_5.5.1_3.0_1734301695030.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("t5_small_sum_dpo_177k_32_1ep_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("t5_small_sum_dpo_177k_32_1ep_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_small_sum_dpo_177k_32_1ep_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|187.3 MB| + +## References + +https://huggingface.co/Muadil/t5-small_sum_DPO_177k_32_1ep + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-t5_tiny_random_michaelbenayoun_en.md b/docs/_posts/ahmedlone127/2024-12-15-t5_tiny_random_michaelbenayoun_en.md new file mode 100644 index 00000000000000..99d21d998c146b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-t5_tiny_random_michaelbenayoun_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English t5_tiny_random_michaelbenayoun T5Transformer from michaelbenayoun +author: John Snow Labs +name: t5_tiny_random_michaelbenayoun +date: 2024-12-15 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_tiny_random_michaelbenayoun` is a English model originally trained by michaelbenayoun. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_tiny_random_michaelbenayoun_en_5.5.1_3.0_1734301157700.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_tiny_random_michaelbenayoun_en_5.5.1_3.0_1734301157700.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("t5_tiny_random_michaelbenayoun","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("t5_tiny_random_michaelbenayoun", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_tiny_random_michaelbenayoun| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|26.9 MB| + +## References + +https://huggingface.co/michaelbenayoun/t5-tiny-random \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-t5_tiny_random_michaelbenayoun_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-t5_tiny_random_michaelbenayoun_pipeline_en.md new file mode 100644 index 00000000000000..e6e55c1fbcf399 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-t5_tiny_random_michaelbenayoun_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English t5_tiny_random_michaelbenayoun_pipeline pipeline T5Transformer from michaelbenayoun +author: John Snow Labs +name: t5_tiny_random_michaelbenayoun_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_tiny_random_michaelbenayoun_pipeline` is a English model originally trained by michaelbenayoun. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_tiny_random_michaelbenayoun_pipeline_en_5.5.1_3.0_1734301159309.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_tiny_random_michaelbenayoun_pipeline_en_5.5.1_3.0_1734301159309.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("t5_tiny_random_michaelbenayoun_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("t5_tiny_random_michaelbenayoun_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_tiny_random_michaelbenayoun_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|26.9 MB| + +## References + +https://huggingface.co/michaelbenayoun/t5-tiny-random + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-temp_model_output_dir_en.md b/docs/_posts/ahmedlone127/2024-12-15-temp_model_output_dir_en.md new file mode 100644 index 00000000000000..257db1f8fab242 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-temp_model_output_dir_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English temp_model_output_dir RoBertaForSequenceClassification from C-Stuti +author: John Snow Labs +name: temp_model_output_dir +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`temp_model_output_dir` is a English model originally trained by C-Stuti. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/temp_model_output_dir_en_5.5.1_3.0_1734287425484.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/temp_model_output_dir_en_5.5.1_3.0_1734287425484.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = RoBertaForSequenceClassification.pretrained("temp_model_output_dir","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = RoBertaForSequenceClassification.pretrained("temp_model_output_dir", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
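
Once the pipeline above has run, the predicted label sits in the `class` column; per-label scores, when the model exposes them, are kept in the annotation metadata. A minimal sketch:

```python
# Predicted label next to the input text.
pipelineDF.select("text", "class.result").show(truncate=False)

# Per-label scores (if provided by the model) live in the annotation metadata.
pipelineDF.selectExpr("explode(`class`.metadata) as scores").show(truncate=False)
```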
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|temp_model_output_dir| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|455.9 MB| + +## References + +https://huggingface.co/C-Stuti/temp_model_output_dir \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-temp_model_output_dir_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-temp_model_output_dir_pipeline_en.md new file mode 100644 index 00000000000000..ba5364f309d5be --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-temp_model_output_dir_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English temp_model_output_dir_pipeline pipeline RoBertaForSequenceClassification from C-Stuti +author: John Snow Labs +name: temp_model_output_dir_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`temp_model_output_dir_pipeline` is a English model originally trained by C-Stuti. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/temp_model_output_dir_pipeline_en_5.5.1_3.0_1734287452077.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/temp_model_output_dir_pipeline_en_5.5.1_3.0_1734287452077.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("temp_model_output_dir_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("temp_model_output_dir_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|temp_model_output_dir_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|455.9 MB| + +## References + +https://huggingface.co/C-Stuti/temp_model_output_dir + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-test_model_en.md b/docs/_posts/ahmedlone127/2024-12-15-test_model_en.md new file mode 100644 index 00000000000000..7e607bfbed5158 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-test_model_en.md @@ -0,0 +1,88 @@ +--- +layout: model +title: English test_model MPNetEmbeddings from Linco +author: John Snow Labs +name: test_model +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`test_model` is a English model originally trained by Linco. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_model_en_5.5.1_3.0_1734293470450.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_model_en_5.5.1_3.0_1734293470450.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("test_model","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("test_model","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
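
The sentence embeddings produced above can be pulled out of the DataFrame as plain float arrays; a minimal sketch:

```python
from pyspark.sql.functions import col, explode

# One embedding vector per document; its length matches the model's hidden size.
vectors = pipelineDF.select(explode(col("embeddings.embeddings")).alias("vector"))
vectors.show(1, truncate=80)
print("vector length:", len(vectors.first()["vector"]))
```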
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_model| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|812.8 MB| + +## References + +References + +https://huggingface.co/Linco/test-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-test_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-test_model_pipeline_en.md new file mode 100644 index 00000000000000..554bc0f92ad84e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-test_model_pipeline_en.md @@ -0,0 +1,72 @@ +--- +layout: model +title: English test_model_pipeline pipeline MPNetEmbeddings from Linco +author: John Snow Labs +name: test_model_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`test_model_pipeline` is a English model originally trained by Linco. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_model_pipeline_en_5.5.1_3.0_1734293591288.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_model_pipeline_en_5.5.1_3.0_1734293591288.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +pipeline = PretrainedPipeline("test_model_pipeline", lang = "en") +annotations = pipeline.transform(df) +``` +```scala +val pipeline = new PretrainedPipeline("test_model_pipeline", lang = "en") +val annotations = pipeline.transform(df) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|812.8 MB| + +## References + +References + +https://huggingface.co/Linco/test-model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-tiny_random_debertav2forquestionanswering_en.md b/docs/_posts/ahmedlone127/2024-12-15-tiny_random_debertav2forquestionanswering_en.md new file mode 100644 index 00000000000000..9de6d65c23702b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-tiny_random_debertav2forquestionanswering_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English tiny_random_debertav2forquestionanswering BertForQuestionAnswering from ydshieh +author: John Snow Labs +name: tiny_random_debertav2forquestionanswering +date: 2024-12-15 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`tiny_random_debertav2forquestionanswering` is a English model originally trained by ydshieh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_random_debertav2forquestionanswering_en_5.5.1_3.0_1734297045975.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_random_debertav2forquestionanswering_en_5.5.1_3.0_1734297045975.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = MultiDocumentAssembler() \
    .setInputCols(["question", "context"]) \
    .setOutputCols(["document_question", "document_context"])

spanClassifier = BertForQuestionAnswering.pretrained("tiny_random_debertav2forquestionanswering","en") \
    .setInputCols(["document_question","document_context"]) \
    .setOutputCol("answer")

pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new MultiDocumentAssembler()
    .setInputCols(Array("question", "context"))
    .setOutputCols(Array("document_question", "document_context"))

val spanClassifier = BertForQuestionAnswering.pretrained("tiny_random_debertav2forquestionanswering", "en")
    .setInputCols(Array("document_question","document_context"))
    .setOutputCol("answer")

val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDF("question", "context")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
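
The predicted answer span lands in the `answer` column. As a minimal sketch (the question/context pair below is an invented example and says nothing about this particular checkpoint's quality):

```python
# Ask a new question/context pair and print the predicted answer span.
qa_data = spark.createDataFrame(
    [["What engine does Spark NLP run on?", "Spark NLP runs on top of Apache Spark."]]
).toDF("question", "context")

pipelineModel.transform(qa_data) \
    .selectExpr("explode(answer.result) as answer") \
    .show(truncate=False)
```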
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_random_debertav2forquestionanswering| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|349.2 KB| + +## References + +https://huggingface.co/ydshieh/tiny-random-DebertaV2ForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-tiny_random_debertav2forquestionanswering_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-tiny_random_debertav2forquestionanswering_pipeline_en.md new file mode 100644 index 00000000000000..438ee738d9e561 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-tiny_random_debertav2forquestionanswering_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English tiny_random_debertav2forquestionanswering_pipeline pipeline BertForQuestionAnswering from ydshieh +author: John Snow Labs +name: tiny_random_debertav2forquestionanswering_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`tiny_random_debertav2forquestionanswering_pipeline` is a English model originally trained by ydshieh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_random_debertav2forquestionanswering_pipeline_en_5.5.1_3.0_1734297046590.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_random_debertav2forquestionanswering_pipeline_en_5.5.1_3.0_1734297046590.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("tiny_random_debertav2forquestionanswering_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("tiny_random_debertav2forquestionanswering_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_random_debertav2forquestionanswering_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|355.8 KB| + +## References + +https://huggingface.co/ydshieh/tiny-random-DebertaV2ForQuestionAnswering + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-tiny_random_mpnetfortokenclassification_en.md b/docs/_posts/ahmedlone127/2024-12-15-tiny_random_mpnetfortokenclassification_en.md new file mode 100644 index 00000000000000..e18a30dec8192c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-tiny_random_mpnetfortokenclassification_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English tiny_random_mpnetfortokenclassification MPNetForTokenClassification from hf-tiny-model-private +author: John Snow Labs +name: tiny_random_mpnetfortokenclassification +date: 2024-12-15 +tags: [en, open_source, onnx, token_classification, mpnet, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`tiny_random_mpnetfortokenclassification` is a English model originally trained by hf-tiny-model-private. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_random_mpnetfortokenclassification_en_5.5.1_3.0_1734290639780.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_random_mpnetfortokenclassification_en_5.5.1_3.0_1734290639780.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = MPNetForTokenClassification.pretrained("tiny_random_mpnetfortokenclassification","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = MPNetForTokenClassification.pretrained("tiny_random_mpnetfortokenclassification", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
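
Each token receives a tag in the `ner` column, aligned index-by-index with `token`. A minimal sketch for viewing them together:

```python
from pyspark.sql.functions import col

# Tokens and their predicted tags, as two aligned arrays per row.
pipelineDF.select(
    col("token.result").alias("tokens"),
    col("ner.result").alias("tags")
).show(truncate=False)
```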
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_random_mpnetfortokenclassification| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|889.4 KB| + +## References + +https://huggingface.co/hf-tiny-model-private/tiny-random-MPNetForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-tiny_random_mpnetfortokenclassification_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-tiny_random_mpnetfortokenclassification_pipeline_en.md new file mode 100644 index 00000000000000..56f1db10067652 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-tiny_random_mpnetfortokenclassification_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English tiny_random_mpnetfortokenclassification_pipeline pipeline MPNetForTokenClassification from hf-tiny-model-private +author: John Snow Labs +name: tiny_random_mpnetfortokenclassification_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`tiny_random_mpnetfortokenclassification_pipeline` is a English model originally trained by hf-tiny-model-private. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_random_mpnetfortokenclassification_pipeline_en_5.5.1_3.0_1734290640187.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_random_mpnetfortokenclassification_pipeline_en_5.5.1_3.0_1734290640187.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("tiny_random_mpnetfortokenclassification_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("tiny_random_mpnetfortokenclassification_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_random_mpnetfortokenclassification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|911.4 KB| + +## References + +https://huggingface.co/hf-tiny-model-private/tiny-random-MPNetForTokenClassification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- MPNetForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-tinybert_train_en.md b/docs/_posts/ahmedlone127/2024-12-15-tinybert_train_en.md new file mode 100644 index 00000000000000..16332523f2a279 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-tinybert_train_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English tinybert_train DistilBertEmbeddings from gokulsrinivasagan +author: John Snow Labs +name: tinybert_train +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`tinybert_train` is a English model originally trained by gokulsrinivasagan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tinybert_train_en_5.5.1_3.0_1734289303963.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tinybert_train_en_5.5.1_3.0_1734289303963.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("tinybert_train","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("tinybert_train","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
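
The model emits one vector per token into the `embeddings` column. A minimal sketch for inspecting the vectors produced by the run above:

```python
from pyspark.sql.functions import col, explode

# One embedding per token; check the dimensionality of the first vector.
token_vectors = pipelineDF.select(explode(col("embeddings.embeddings")).alias("vector"))
token_vectors.show(3, truncate=80)
print("vector length:", len(token_vectors.first()["vector"]))
```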
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tinybert_train| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|122.9 MB| + +## References + +https://huggingface.co/gokulsrinivasagan/tinybert_train \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-tinybert_train_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-tinybert_train_pipeline_en.md new file mode 100644 index 00000000000000..c761b56f88a78d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-tinybert_train_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English tinybert_train_pipeline pipeline DistilBertEmbeddings from gokulsrinivasagan +author: John Snow Labs +name: tinybert_train_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`tinybert_train_pipeline` is a English model originally trained by gokulsrinivasagan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tinybert_train_pipeline_en_5.5.1_3.0_1734289309638.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tinybert_train_pipeline_en_5.5.1_3.0_1734289309638.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("tinybert_train_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("tinybert_train_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tinybert_train_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|122.9 MB| + +## References + +https://huggingface.co/gokulsrinivasagan/tinybert_train + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-toxic_retriever_en.md b/docs/_posts/ahmedlone127/2024-12-15-toxic_retriever_en.md new file mode 100644 index 00000000000000..a51f890fdeeb30 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-toxic_retriever_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English toxic_retriever MPNetEmbeddings from thiemcun203 +author: John Snow Labs +name: toxic_retriever +date: 2024-12-15 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`toxic_retriever` is a English model originally trained by thiemcun203. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/toxic_retriever_en_5.5.1_3.0_1734306166105.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/toxic_retriever_en_5.5.1_3.0_1734306166105.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("toxic_retriever","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("toxic_retriever","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
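
As a rough sketch of how a retriever embedding like this might be used (the two sentences are arbitrary examples, and any similarity threshold is up to the application), two texts can be embedded with the fitted pipeline above and compared with cosine similarity:

```python
import numpy as np

# Embed two sentences and compare their vectors with cosine similarity.
pairs = spark.createDataFrame([["thanks, that was helpful"], ["you are the worst"]]).toDF("text")
rows = pipelineModel.transform(pairs).select("embeddings.embeddings").collect()

a = np.array(rows[0]["embeddings"][0])
b = np.array(rows[1]["embeddings"][0])
print(float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b))))
```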
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|toxic_retriever| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/thiemcun203/Toxic-Retriever \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-toxic_retriever_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-toxic_retriever_pipeline_en.md new file mode 100644 index 00000000000000..4d7d08d2c9e3a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-toxic_retriever_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English toxic_retriever_pipeline pipeline MPNetEmbeddings from thiemcun203 +author: John Snow Labs +name: toxic_retriever_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`toxic_retriever_pipeline` is a English model originally trained by thiemcun203. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/toxic_retriever_pipeline_en_5.5.1_3.0_1734306187143.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/toxic_retriever_pipeline_en_5.5.1_3.0_1734306187143.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

pipeline = PretrainedPipeline("toxic_retriever_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")

val pipeline = new PretrainedPipeline("toxic_retriever_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
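For quick experiments on raw strings, the pretrained pipeline can also be called driver-side instead of on a DataFrame. A minimal sketch, assuming the same pipeline object as above:

```python
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("toxic_retriever_pipeline", lang="en")

# annotate() returns a dict of result strings keyed by output column;
# fullAnnotate() keeps the full Annotation objects (metadata, embeddings).
result = pipeline.annotate("I love spark-nlp")
full = pipeline.fullAnnotate("I love spark-nlp")
```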
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|toxic_retriever_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/thiemcun203/Toxic-Retriever + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-toxicity_token_classifier_en.md b/docs/_posts/ahmedlone127/2024-12-15-toxicity_token_classifier_en.md new file mode 100644 index 00000000000000..0392b0b3ef298e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-toxicity_token_classifier_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English toxicity_token_classifier MPNetForTokenClassification from Sinanmz +author: John Snow Labs +name: toxicity_token_classifier +date: 2024-12-15 +tags: [en, open_source, onnx, token_classification, mpnet, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`toxicity_token_classifier` is a English model originally trained by Sinanmz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/toxicity_token_classifier_en_5.5.1_3.0_1734290747624.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/toxicity_token_classifier_en_5.5.1_3.0_1734290747624.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
  .setInputCol('text') \
  .setOutputCol('document')

tokenizer = Tokenizer() \
  .setInputCols(['document']) \
  .setOutputCol('token')

tokenClassifier = MPNetForTokenClassification.pretrained("toxicity_token_classifier","en") \
  .setInputCols(["document","token"]) \
  .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols("document")
  .setOutputCol("token")

val tokenClassifier = MPNetForTokenClassification.pretrained("toxicity_token_classifier", "en")
  .setInputCols(Array("document","token"))
  .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
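To inspect the token-level predictions in the transformed DataFrame, the token and ner annotation arrays can be zipped together, since they are index-aligned. A minimal sketch, assuming the `pipelineDF` produced above:

```python
# Pair each token with its predicted tag.
(pipelineDF
    .selectExpr("explode(arrays_zip(token.result, ner.result)) AS prediction")
    .show(truncate=False))
```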
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|toxicity_token_classifier| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|290.2 MB| + +## References + +https://huggingface.co/Sinanmz/toxicity_token_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-toxicity_token_classifier_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-toxicity_token_classifier_pipeline_en.md new file mode 100644 index 00000000000000..bd2857bd506089 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-toxicity_token_classifier_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English toxicity_token_classifier_pipeline pipeline MPNetForTokenClassification from Sinanmz +author: John Snow Labs +name: toxicity_token_classifier_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`toxicity_token_classifier_pipeline` is a English model originally trained by Sinanmz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/toxicity_token_classifier_pipeline_en_5.5.1_3.0_1734290809123.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/toxicity_token_classifier_pipeline_en_5.5.1_3.0_1734290809123.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

pipeline = PretrainedPipeline("toxicity_token_classifier_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")

val pipeline = new PretrainedPipeline("toxicity_token_classifier_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|toxicity_token_classifier_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|290.2 MB| + +## References + +https://huggingface.co/Sinanmz/toxicity_token_classifier + +## Included Models + +- DocumentAssembler +- TokenizerModel +- MPNetForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-twitter_roberta_base_sensitive_binary_en.md b/docs/_posts/ahmedlone127/2024-12-15-twitter_roberta_base_sensitive_binary_en.md new file mode 100644 index 00000000000000..df5f49aa4a706a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-twitter_roberta_base_sensitive_binary_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English twitter_roberta_base_sensitive_binary RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_sensitive_binary +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`twitter_roberta_base_sensitive_binary` is a English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_sensitive_binary_en_5.5.1_3.0_1734287599231.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_sensitive_binary_en_5.5.1_3.0_1734287599231.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
  .setInputCol('text') \
  .setOutputCol('document')

tokenizer = Tokenizer() \
  .setInputCols(['document']) \
  .setOutputCol('token')

sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_sensitive_binary","en") \
  .setInputCols(["document","token"]) \
  .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_sensitive_binary", "en")
  .setInputCols(Array("document","token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
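The predicted label lands in the `class` column of the transformed DataFrame, with per-label scores kept in the annotation metadata. A minimal sketch, assuming the `pipelineDF` produced above:

```python
# `class.result` holds the predicted label(s); `class.metadata` holds the
# class scores emitted by the classifier.
pipelineDF.select("text", "class.result", "class.metadata").show(truncate=False)
```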
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_sensitive_binary| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-sensitive-binary \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-twitter_roberta_base_sensitive_binary_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-twitter_roberta_base_sensitive_binary_pipeline_en.md new file mode 100644 index 00000000000000..6cfa734392c506 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-twitter_roberta_base_sensitive_binary_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English twitter_roberta_base_sensitive_binary_pipeline pipeline RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_sensitive_binary_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`twitter_roberta_base_sensitive_binary_pipeline` is a English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_sensitive_binary_pipeline_en_5.5.1_3.0_1734287630407.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_sensitive_binary_pipeline_en_5.5.1_3.0_1734287630407.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

pipeline = PretrainedPipeline("twitter_roberta_base_sensitive_binary_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")

val pipeline = new PretrainedPipeline("twitter_roberta_base_sensitive_binary_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_sensitive_binary_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|468.4 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-sensitive-binary + +## Included Models + +- DocumentAssembler +- TokenizerModel +- RoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_quotes_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_quotes_en.md new file mode 100644 index 00000000000000..81420fd40d7169 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_quotes_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_quotes XlmRoBertaForSequenceClassification from burakgo +author: John Snow Labs +name: xlm_quotes +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_quotes` is a English model originally trained by burakgo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_quotes_en_5.5.1_3.0_1734293464873.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_quotes_en_5.5.1_3.0_1734293464873.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
  .setInputCol('text') \
  .setOutputCol('document')

tokenizer = Tokenizer() \
  .setInputCols(['document']) \
  .setOutputCol('token')

sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_quotes","en") \
  .setInputCols(["document","token"]) \
  .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_quotes", "en")
  .setInputCols(Array("document","token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
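For low-latency scoring of individual texts outside a Spark job, the fitted pipeline can be wrapped in a LightPipeline. A minimal sketch, assuming the `pipelineModel` fitted above:

```python
from sparknlp.base import LightPipeline

light = LightPipeline(pipelineModel)

# Runs on the driver, no DataFrame needed; returns a dict per input string.
print(light.annotate("I love spark-nlp")["class"])
```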
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_quotes| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|793.1 MB| + +## References + +https://huggingface.co/burakgo/xlm_quotes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_quotes_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_quotes_pipeline_en.md new file mode 100644 index 00000000000000..6d069c81bdb33c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_quotes_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_quotes_pipeline pipeline XlmRoBertaForSequenceClassification from burakgo +author: John Snow Labs +name: xlm_quotes_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_quotes_pipeline` is a English model originally trained by burakgo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_quotes_pipeline_en_5.5.1_3.0_1734293586475.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_quotes_pipeline_en_5.5.1_3.0_1734293586475.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

pipeline = PretrainedPipeline("xlm_quotes_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")

val pipeline = new PretrainedPipeline("xlm_quotes_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_quotes_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|793.2 MB| + +## References + +https://huggingface.co/burakgo/xlm_quotes + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_10_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_10_en.md new file mode 100644 index 00000000000000..a7c0a6549514d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_10_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_10 XlmRoBertaForSequenceClassification from alyazharr +author: John Snow Labs +name: xlm_roberta_base_10 +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_10` is a English model originally trained by alyazharr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_10_en_5.5.1_3.0_1734292483249.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_10_en_5.5.1_3.0_1734292483249.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
  .setInputCol('text') \
  .setOutputCol('document')

tokenizer = Tokenizer() \
  .setInputCols(['document']) \
  .setOutputCol('token')

sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_10","en") \
  .setInputCols(["document","token"]) \
  .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_10", "en")
  .setInputCols(Array("document","token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
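The fitted pipeline is a regular Spark ML PipelineModel, so it can be persisted once and reloaded without downloading the model again. A minimal sketch (the path is illustrative):

```python
from pyspark.ml import PipelineModel

# Save the fitted pipeline, transformer weights included.
pipelineModel.write().overwrite().save("/tmp/xlm_roberta_base_10_pipeline_model")

# Reload it later in any Spark NLP-enabled session.
restored = PipelineModel.load("/tmp/xlm_roberta_base_10_pipeline_model")
restored.transform(data).select("class.result").show()
```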
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_10| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|827.6 MB| + +## References + +https://huggingface.co/alyazharr/xlm_roberta_base_10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_10_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_10_pipeline_en.md new file mode 100644 index 00000000000000..503fbcc31800e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_10_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_10_pipeline pipeline XlmRoBertaForSequenceClassification from alyazharr +author: John Snow Labs +name: xlm_roberta_base_10_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_10_pipeline` is a English model originally trained by alyazharr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_10_pipeline_en_5.5.1_3.0_1734292572211.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_10_pipeline_en_5.5.1_3.0_1734292572211.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

pipeline = PretrainedPipeline("xlm_roberta_base_10_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")

val pipeline = new PretrainedPipeline("xlm_roberta_base_10_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_10_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|827.6 MB| + +## References + +https://huggingface.co/alyazharr/xlm_roberta_base_10 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_mixed_aug_delete_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_mixed_aug_delete_en.md new file mode 100644 index 00000000000000..0184f1ebd4326d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_mixed_aug_delete_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_balance_mixed_aug_delete XlmRoBertaForSequenceClassification from ThuyNT03 +author: John Snow Labs +name: xlm_roberta_base_balance_mixed_aug_delete +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_balance_mixed_aug_delete` is a English model originally trained by ThuyNT03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_balance_mixed_aug_delete_en_5.5.1_3.0_1734291250003.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_balance_mixed_aug_delete_en_5.5.1_3.0_1734291250003.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
  .setInputCol('text') \
  .setOutputCol('document')

tokenizer = Tokenizer() \
  .setInputCols(['document']) \
  .setOutputCol('token')

sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_balance_mixed_aug_delete","en") \
  .setInputCols(["document","token"]) \
  .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_balance_mixed_aug_delete", "en")
  .setInputCols(Array("document","token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
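Throughput and memory behaviour of the transformer stage can be tuned before fitting. The setter names below follow the Spark NLP classifier API; the values are illustrative only:

```python
sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_balance_mixed_aug_delete", "en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class") \
    .setBatchSize(8) \
    .setCaseSensitive(True) \
    .setMaxSentenceLength(128)
```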
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_balance_mixed_aug_delete| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|794.3 MB| + +## References + +https://huggingface.co/ThuyNT03/xlm-roberta-base-Balance_Mixed-aug_delete \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_mixed_aug_delete_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_mixed_aug_delete_pipeline_en.md new file mode 100644 index 00000000000000..dba9e830348de9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_mixed_aug_delete_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_balance_mixed_aug_delete_pipeline pipeline XlmRoBertaForSequenceClassification from ThuyNT03 +author: John Snow Labs +name: xlm_roberta_base_balance_mixed_aug_delete_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_balance_mixed_aug_delete_pipeline` is a English model originally trained by ThuyNT03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_balance_mixed_aug_delete_pipeline_en_5.5.1_3.0_1734291376667.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_balance_mixed_aug_delete_pipeline_en_5.5.1_3.0_1734291376667.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

pipeline = PretrainedPipeline("xlm_roberta_base_balance_mixed_aug_delete_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")

val pipeline = new PretrainedPipeline("xlm_roberta_base_balance_mixed_aug_delete_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_balance_mixed_aug_delete_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|794.3 MB| + +## References + +https://huggingface.co/ThuyNT03/xlm-roberta-base-Balance_Mixed-aug_delete + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_mixed_aug_replace_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_mixed_aug_replace_en.md new file mode 100644 index 00000000000000..eba0abb2b739ab --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_mixed_aug_replace_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_balance_mixed_aug_replace XlmRoBertaForSequenceClassification from ThuyNT03 +author: John Snow Labs +name: xlm_roberta_base_balance_mixed_aug_replace +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_balance_mixed_aug_replace` is a English model originally trained by ThuyNT03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_balance_mixed_aug_replace_en_5.5.1_3.0_1734292073411.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_balance_mixed_aug_replace_en_5.5.1_3.0_1734292073411.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
  .setInputCol('text') \
  .setOutputCol('document')

tokenizer = Tokenizer() \
  .setInputCols(['document']) \
  .setOutputCol('token')

sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_balance_mixed_aug_replace","en") \
  .setInputCols(["document","token"]) \
  .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_balance_mixed_aug_replace", "en")
  .setInputCols(Array("document","token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_balance_mixed_aug_replace| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|796.2 MB| + +## References + +https://huggingface.co/ThuyNT03/xlm-roberta-base-Balance_Mixed-aug_replace \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_mixed_aug_replace_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_mixed_aug_replace_pipeline_en.md new file mode 100644 index 00000000000000..57fea989c5f3a6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_mixed_aug_replace_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_balance_mixed_aug_replace_pipeline pipeline XlmRoBertaForSequenceClassification from ThuyNT03 +author: John Snow Labs +name: xlm_roberta_base_balance_mixed_aug_replace_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_balance_mixed_aug_replace_pipeline` is a English model originally trained by ThuyNT03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_balance_mixed_aug_replace_pipeline_en_5.5.1_3.0_1734292193958.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_balance_mixed_aug_replace_pipeline_en_5.5.1_3.0_1734292193958.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

pipeline = PretrainedPipeline("xlm_roberta_base_balance_mixed_aug_replace_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")

val pipeline = new PretrainedPipeline("xlm_roberta_base_balance_mixed_aug_replace_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_balance_mixed_aug_replace_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|796.3 MB| + +## References + +https://huggingface.co/ThuyNT03/xlm-roberta-base-Balance_Mixed-aug_replace + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_vietnam_aug_insert_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_vietnam_aug_insert_en.md new file mode 100644 index 00000000000000..04cd6ab53c7eee --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_vietnam_aug_insert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_balance_vietnam_aug_insert XlmRoBertaForSequenceClassification from ThuyNT03 +author: John Snow Labs +name: xlm_roberta_base_balance_vietnam_aug_insert +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_balance_vietnam_aug_insert` is a English model originally trained by ThuyNT03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_balance_vietnam_aug_insert_en_5.5.1_3.0_1734294199493.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_balance_vietnam_aug_insert_en_5.5.1_3.0_1734294199493.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
  .setInputCol('text') \
  .setOutputCol('document')

tokenizer = Tokenizer() \
  .setInputCols(['document']) \
  .setOutputCol('token')

sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_balance_vietnam_aug_insert","en") \
  .setInputCols(["document","token"]) \
  .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_balance_vietnam_aug_insert", "en")
  .setInputCols(Array("document","token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
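If a document should be classified sentence by sentence rather than as a whole, a sentence detector can be inserted and the classifier pointed at the sentence column instead. A sketch under that assumption, reusing the `documentAssembler` defined above:

```python
from sparknlp.annotator import SentenceDetector

sentenceDetector = SentenceDetector() \
    .setInputCols(["document"]) \
    .setOutputCol("sentence")

tokenizer = Tokenizer() \
    .setInputCols(["sentence"]) \
    .setOutputCol("token")

sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_balance_vietnam_aug_insert", "en") \
    .setInputCols(["sentence", "token"]) \
    .setOutputCol("class")

# One class annotation is produced per detected sentence.
pipeline = Pipeline().setStages([documentAssembler, sentenceDetector, tokenizer, sequenceClassifier])
```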
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_balance_vietnam_aug_insert| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|795.5 MB| + +## References + +https://huggingface.co/ThuyNT03/xlm-roberta-base-Balance_VietNam-aug_insert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_vietnam_aug_insert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_vietnam_aug_insert_pipeline_en.md new file mode 100644 index 00000000000000..b0c15bf9dcf8fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_vietnam_aug_insert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_balance_vietnam_aug_insert_pipeline pipeline XlmRoBertaForSequenceClassification from ThuyNT03 +author: John Snow Labs +name: xlm_roberta_base_balance_vietnam_aug_insert_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_balance_vietnam_aug_insert_pipeline` is a English model originally trained by ThuyNT03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_balance_vietnam_aug_insert_pipeline_en_5.5.1_3.0_1734294320212.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_balance_vietnam_aug_insert_pipeline_en_5.5.1_3.0_1734294320212.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

pipeline = PretrainedPipeline("xlm_roberta_base_balance_vietnam_aug_insert_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")

val pipeline = new PretrainedPipeline("xlm_roberta_base_balance_vietnam_aug_insert_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_balance_vietnam_aug_insert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|795.5 MB| + +## References + +https://huggingface.co/ThuyNT03/xlm-roberta-base-Balance_VietNam-aug_insert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_vietnam_aug_replace_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_vietnam_aug_replace_en.md new file mode 100644 index 00000000000000..ec3af392ff1d7e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_vietnam_aug_replace_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_balance_vietnam_aug_replace XlmRoBertaForSequenceClassification from ThuyNT03 +author: John Snow Labs +name: xlm_roberta_base_balance_vietnam_aug_replace +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_balance_vietnam_aug_replace` is a English model originally trained by ThuyNT03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_balance_vietnam_aug_replace_en_5.5.1_3.0_1734292927465.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_balance_vietnam_aug_replace_en_5.5.1_3.0_1734292927465.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
  .setInputCol('text') \
  .setOutputCol('document')

tokenizer = Tokenizer() \
  .setInputCols(['document']) \
  .setOutputCol('token')

sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_balance_vietnam_aug_replace","en") \
  .setInputCols(["document","token"]) \
  .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_balance_vietnam_aug_replace", "en")
  .setInputCols(Array("document","token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_balance_vietnam_aug_replace| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|794.9 MB| + +## References + +https://huggingface.co/ThuyNT03/xlm-roberta-base-Balance_VietNam-aug_replace \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_vietnam_aug_replace_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_vietnam_aug_replace_pipeline_en.md new file mode 100644 index 00000000000000..f884881c135e79 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_balance_vietnam_aug_replace_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_balance_vietnam_aug_replace_pipeline pipeline XlmRoBertaForSequenceClassification from ThuyNT03 +author: John Snow Labs +name: xlm_roberta_base_balance_vietnam_aug_replace_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_balance_vietnam_aug_replace_pipeline` is a English model originally trained by ThuyNT03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_balance_vietnam_aug_replace_pipeline_en_5.5.1_3.0_1734293054350.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_balance_vietnam_aug_replace_pipeline_en_5.5.1_3.0_1734293054350.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

pipeline = PretrainedPipeline("xlm_roberta_base_balance_vietnam_aug_replace_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")

val pipeline = new PretrainedPipeline("xlm_roberta_base_balance_vietnam_aug_replace_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_balance_vietnam_aug_replace_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|794.9 MB| + +## References + +https://huggingface.co/ThuyNT03/xlm-roberta-base-Balance_VietNam-aug_replace + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_finetuned_detests_wandb24_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_finetuned_detests_wandb24_en.md new file mode 100644 index 00000000000000..7c7f14ffde473c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_finetuned_detests_wandb24_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_detests_wandb24 XlmRoBertaForSequenceClassification from Pablo94 +author: John Snow Labs +name: xlm_roberta_base_finetuned_detests_wandb24 +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_detests_wandb24` is a English model originally trained by Pablo94. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_detests_wandb24_en_5.5.1_3.0_1734292296505.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_detests_wandb24_en_5.5.1_3.0_1734292296505.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
  .setInputCol('text') \
  .setOutputCol('document')

tokenizer = Tokenizer() \
  .setInputCols(['document']) \
  .setOutputCol('token')

sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_finetuned_detests_wandb24","en") \
  .setInputCols(["document","token"]) \
  .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_finetuned_detests_wandb24", "en")
  .setInputCols(Array("document","token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_detests_wandb24| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|823.1 MB| + +## References + +https://huggingface.co/Pablo94/xlm-roberta-base-finetuned-detests-wandb24 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_finetuned_detests_wandb24_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_finetuned_detests_wandb24_pipeline_en.md new file mode 100644 index 00000000000000..ec411f1b770f34 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_finetuned_detests_wandb24_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_detests_wandb24_pipeline pipeline XlmRoBertaForSequenceClassification from Pablo94 +author: John Snow Labs +name: xlm_roberta_base_finetuned_detests_wandb24_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_detests_wandb24_pipeline` is a English model originally trained by Pablo94. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_detests_wandb24_pipeline_en_5.5.1_3.0_1734292394439.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_detests_wandb24_pipeline_en_5.5.1_3.0_1734292394439.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_detests_wandb24_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")

val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_detests_wandb24_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_detests_wandb24_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|823.1 MB| + +## References + +https://huggingface.co/Pablo94/xlm-roberta-base-finetuned-detests-wandb24 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_finetuned_marc_english_fancorparation_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_finetuned_marc_english_fancorparation_en.md new file mode 100644 index 00000000000000..16b3d5165df0eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_finetuned_marc_english_fancorparation_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_marc_english_fancorparation XlmRoBertaForSequenceClassification from Fancorparation +author: John Snow Labs +name: xlm_roberta_base_finetuned_marc_english_fancorparation +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_marc_english_fancorparation` is a English model originally trained by Fancorparation. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_marc_english_fancorparation_en_5.5.1_3.0_1734291713697.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_marc_english_fancorparation_en_5.5.1_3.0_1734291713697.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
  .setInputCol('text') \
  .setOutputCol('document')

tokenizer = Tokenizer() \
  .setInputCols(['document']) \
  .setOutputCol('token')

sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_finetuned_marc_english_fancorparation","en") \
  .setInputCols(["document","token"]) \
  .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_finetuned_marc_english_fancorparation", "en")
  .setInputCols(Array("document","token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_marc_english_fancorparation| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|813.0 MB| + +## References + +https://huggingface.co/Fancorparation/xlm-roberta-base-finetuned-marc-en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_finetuned_marc_english_fancorparation_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_finetuned_marc_english_fancorparation_pipeline_en.md new file mode 100644 index 00000000000000..9422573a9835b8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_finetuned_marc_english_fancorparation_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_marc_english_fancorparation_pipeline pipeline XlmRoBertaForSequenceClassification from Fancorparation +author: John Snow Labs +name: xlm_roberta_base_finetuned_marc_english_fancorparation_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_marc_english_fancorparation_pipeline` is a English model originally trained by Fancorparation. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_marc_english_fancorparation_pipeline_en_5.5.1_3.0_1734291825951.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_marc_english_fancorparation_pipeline_en_5.5.1_3.0_1734291825951.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_marc_english_fancorparation_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")

val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_marc_english_fancorparation_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
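
For a quick single-sentence check, `PretrainedPipeline` also exposes `annotate()`, which skips building a DataFrame; a minimal sketch, assuming the classifier stage keeps its default `class` output column name:

```python
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_marc_english_fancorparation_pipeline", lang="en")

# annotate() returns a dict keyed by output column name.
result = pipeline.annotate("I love spark-nlp")
print(result["class"])
```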
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_marc_english_fancorparation_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|813.1 MB| + +## References + +https://huggingface.co/Fancorparation/xlm-roberta-base-finetuned-marc-en + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_language_detection_basque_english_spanish_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_language_detection_basque_english_spanish_en.md new file mode 100644 index 00000000000000..a663f63eac9487 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_language_detection_basque_english_spanish_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_language_detection_basque_english_spanish XlmRoBertaForSequenceClassification from jusKnows +author: John Snow Labs +name: xlm_roberta_base_language_detection_basque_english_spanish +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_language_detection_basque_english_spanish` is a English model originally trained by jusKnows. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_language_detection_basque_english_spanish_en_5.5.1_3.0_1734293102352.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_language_detection_basque_english_spanish_en_5.5.1_3.0_1734293102352.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_language_detection_basque_english_spanish","en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_language_detection_basque_english_spanish", "en")
  .setInputCols(Array("document", "token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_language_detection_basque_english_spanish| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|858.1 MB| + +## References + +https://huggingface.co/jusKnows/xlm-roberta-base-language-detection-eu_en_es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_language_detection_basque_english_spanish_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_language_detection_basque_english_spanish_pipeline_en.md new file mode 100644 index 00000000000000..2b9047e087d965 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_language_detection_basque_english_spanish_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_language_detection_basque_english_spanish_pipeline pipeline XlmRoBertaForSequenceClassification from jusKnows +author: John Snow Labs +name: xlm_roberta_base_language_detection_basque_english_spanish_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_language_detection_basque_english_spanish_pipeline` is a English model originally trained by jusKnows. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_language_detection_basque_english_spanish_pipeline_en_5.5.1_3.0_1734293217635.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_language_detection_basque_english_spanish_pipeline_en_5.5.1_3.0_1734293217635.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

pipeline = PretrainedPipeline("xlm_roberta_base_language_detection_basque_english_spanish_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")

val pipeline = new PretrainedPipeline("xlm_roberta_base_language_detection_basque_english_spanish_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_language_detection_basque_english_spanish_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|858.1 MB| + +## References + +https://huggingface.co/jusKnows/xlm-roberta-base-language-detection-eu_en_es + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_nepal_bhasa_vietnam_aug_replace_tfidf_1_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_nepal_bhasa_vietnam_aug_replace_tfidf_1_en.md new file mode 100644 index 00000000000000..bca9441dac6b93 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_nepal_bhasa_vietnam_aug_replace_tfidf_1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_nepal_bhasa_vietnam_aug_replace_tfidf_1 XlmRoBertaForSequenceClassification from ThuyNT03 +author: John Snow Labs +name: xlm_roberta_base_nepal_bhasa_vietnam_aug_replace_tfidf_1 +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_nepal_bhasa_vietnam_aug_replace_tfidf_1` is a English model originally trained by ThuyNT03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_nepal_bhasa_vietnam_aug_replace_tfidf_1_en_5.5.1_3.0_1734294251039.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_nepal_bhasa_vietnam_aug_replace_tfidf_1_en_5.5.1_3.0_1734294251039.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_nepal_bhasa_vietnam_aug_replace_tfidf_1","en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_nepal_bhasa_vietnam_aug_replace_tfidf_1", "en")
  .setInputCols(Array("document", "token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_nepal_bhasa_vietnam_aug_replace_tfidf_1| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|798.2 MB| + +## References + +https://huggingface.co/ThuyNT03/xlm-roberta-base-New_VietNam-aug_replace_tfidf-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_nepal_bhasa_vietnam_aug_replace_tfidf_1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_nepal_bhasa_vietnam_aug_replace_tfidf_1_pipeline_en.md new file mode 100644 index 00000000000000..4b479a043054b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_nepal_bhasa_vietnam_aug_replace_tfidf_1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_nepal_bhasa_vietnam_aug_replace_tfidf_1_pipeline pipeline XlmRoBertaForSequenceClassification from ThuyNT03 +author: John Snow Labs +name: xlm_roberta_base_nepal_bhasa_vietnam_aug_replace_tfidf_1_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_nepal_bhasa_vietnam_aug_replace_tfidf_1_pipeline` is a English model originally trained by ThuyNT03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_nepal_bhasa_vietnam_aug_replace_tfidf_1_pipeline_en_5.5.1_3.0_1734294371524.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_nepal_bhasa_vietnam_aug_replace_tfidf_1_pipeline_en_5.5.1_3.0_1734294371524.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

pipeline = PretrainedPipeline("xlm_roberta_base_nepal_bhasa_vietnam_aug_replace_tfidf_1_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")

val pipeline = new PretrainedPipeline("xlm_roberta_base_nepal_bhasa_vietnam_aug_replace_tfidf_1_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_nepal_bhasa_vietnam_aug_replace_tfidf_1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|798.2 MB| + +## References + +https://huggingface.co/ThuyNT03/xlm-roberta-base-New_VietNam-aug_replace_tfidf-1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_stress_identification_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_stress_identification_en.md new file mode 100644 index 00000000000000..13eb4af18e2cc2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_stress_identification_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_stress_identification XlmRoBertaForSequenceClassification from mdosama39 +author: John Snow Labs +name: xlm_roberta_base_stress_identification +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_stress_identification` is a English model originally trained by mdosama39. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_stress_identification_en_5.5.1_3.0_1734294458603.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_stress_identification_en_5.5.1_3.0_1734294458603.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_stress_identification","en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_stress_identification", "en")
  .setInputCols(Array("document", "token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_stress_identification| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|810.4 MB| + +## References + +https://huggingface.co/mdosama39/xlm-roberta-base-Stress-identification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_stress_identification_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_stress_identification_pipeline_en.md new file mode 100644 index 00000000000000..81108c80e4d72f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_stress_identification_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_stress_identification_pipeline pipeline XlmRoBertaForSequenceClassification from mdosama39 +author: John Snow Labs +name: xlm_roberta_base_stress_identification_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_stress_identification_pipeline` is a English model originally trained by mdosama39. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_stress_identification_pipeline_en_5.5.1_3.0_1734294572723.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_stress_identification_pipeline_en_5.5.1_3.0_1734294572723.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

pipeline = PretrainedPipeline("xlm_roberta_base_stress_identification_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")

val pipeline = new PretrainedPipeline("xlm_roberta_base_stress_identification_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_stress_identification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|810.5 MB| + +## References + +https://huggingface.co/mdosama39/xlm-roberta-base-Stress-identification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_vsfc_10_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_vsfc_10_en.md new file mode 100644 index 00000000000000..2226955c37c516 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_vsfc_10_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_vsfc_10 XlmRoBertaForSequenceClassification from tmnam20 +author: John Snow Labs +name: xlm_roberta_base_vsfc_10 +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_vsfc_10` is a English model originally trained by tmnam20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_vsfc_10_en_5.5.1_3.0_1734293406532.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_vsfc_10_en_5.5.1_3.0_1734293406532.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_vsfc_10","en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_vsfc_10", "en")
  .setInputCols(Array("document", "token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
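
Once fitted, the pipeline can be persisted with the standard Spark ML writer and reloaded later for inference; the path below is only an example:

```python
from pyspark.ml import PipelineModel

# Save the fitted pipeline (example path) and reload it for later use.
pipelineModel.write().overwrite().save("/tmp/xlm_roberta_base_vsfc_10_pipeline_model")
restored = PipelineModel.load("/tmp/xlm_roberta_base_vsfc_10_pipeline_model")
restored.transform(data).select("class.result").show(truncate=False)
```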
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_vsfc_10| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|773.7 MB| + +## References + +https://huggingface.co/tmnam20/xlm-roberta-base-vsfc-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_vsfc_10_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_vsfc_10_pipeline_en.md new file mode 100644 index 00000000000000..4a631d63e74a20 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_vsfc_10_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_vsfc_10_pipeline pipeline XlmRoBertaForSequenceClassification from tmnam20 +author: John Snow Labs +name: xlm_roberta_base_vsfc_10_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_vsfc_10_pipeline` is a English model originally trained by tmnam20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_vsfc_10_pipeline_en_5.5.1_3.0_1734293545386.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_vsfc_10_pipeline_en_5.5.1_3.0_1734293545386.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

pipeline = PretrainedPipeline("xlm_roberta_base_vsfc_10_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")

val pipeline = new PretrainedPipeline("xlm_roberta_base_vsfc_10_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_vsfc_10_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|773.7 MB| + +## References + +https://huggingface.co/tmnam20/xlm-roberta-base-vsfc-10 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_vtoc_1_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_vtoc_1_en.md new file mode 100644 index 00000000000000..913d5e0ae3dbe4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_vtoc_1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_vtoc_1 XlmRoBertaForSequenceClassification from tmnam20 +author: John Snow Labs +name: xlm_roberta_base_vtoc_1 +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_vtoc_1` is a English model originally trained by tmnam20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_vtoc_1_en_5.5.1_3.0_1734294003512.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_vtoc_1_en_5.5.1_3.0_1734294003512.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_vtoc_1","en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlm_roberta_base_vtoc_1", "en")
  .setInputCols(Array("document", "token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_vtoc_1| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|783.0 MB| + +## References + +https://huggingface.co/tmnam20/xlm-roberta-base-vtoc-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_vtoc_1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_vtoc_1_pipeline_en.md new file mode 100644 index 00000000000000..a92e815e9f9d41 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlm_roberta_base_vtoc_1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_vtoc_1_pipeline pipeline XlmRoBertaForSequenceClassification from tmnam20 +author: John Snow Labs +name: xlm_roberta_base_vtoc_1_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_vtoc_1_pipeline` is a English model originally trained by tmnam20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_vtoc_1_pipeline_en_5.5.1_3.0_1734294138815.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_vtoc_1_pipeline_en_5.5.1_3.0_1734294138815.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

pipeline = PretrainedPipeline("xlm_roberta_base_vtoc_1_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")

val pipeline = new PretrainedPipeline("xlm_roberta_base_vtoc_1_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_vtoc_1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|783.1 MB| + +## References + +https://huggingface.co/tmnam20/xlm-roberta-base-vtoc-1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlmroberta_finetuned_semitic_languages_eval_rest14_english_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlmroberta_finetuned_semitic_languages_eval_rest14_english_en.md new file mode 100644 index 00000000000000..b5dc0a8a6783de --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlmroberta_finetuned_semitic_languages_eval_rest14_english_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlmroberta_finetuned_semitic_languages_eval_rest14_english XlmRoBertaForSequenceClassification from car13mesquita +author: John Snow Labs +name: xlmroberta_finetuned_semitic_languages_eval_rest14_english +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlmroberta_finetuned_semitic_languages_eval_rest14_english` is a English model originally trained by car13mesquita. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlmroberta_finetuned_semitic_languages_eval_rest14_english_en_5.5.1_3.0_1734291700246.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlmroberta_finetuned_semitic_languages_eval_rest14_english_en_5.5.1_3.0_1734291700246.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlmroberta_finetuned_semitic_languages_eval_rest14_english","en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xlmroberta_finetuned_semitic_languages_eval_rest14_english", "en")
  .setInputCols(Array("document", "token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlmroberta_finetuned_semitic_languages_eval_rest14_english| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|802.3 MB| + +## References + +https://huggingface.co/car13mesquita/xlmroberta-finetuned-sem_eval-rest14-english \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xlmroberta_finetuned_semitic_languages_eval_rest14_english_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-xlmroberta_finetuned_semitic_languages_eval_rest14_english_pipeline_en.md new file mode 100644 index 00000000000000..60c67eb895abb5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xlmroberta_finetuned_semitic_languages_eval_rest14_english_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlmroberta_finetuned_semitic_languages_eval_rest14_english_pipeline pipeline XlmRoBertaForSequenceClassification from car13mesquita +author: John Snow Labs +name: xlmroberta_finetuned_semitic_languages_eval_rest14_english_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlmroberta_finetuned_semitic_languages_eval_rest14_english_pipeline` is a English model originally trained by car13mesquita. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlmroberta_finetuned_semitic_languages_eval_rest14_english_pipeline_en_5.5.1_3.0_1734291817483.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlmroberta_finetuned_semitic_languages_eval_rest14_english_pipeline_en_5.5.1_3.0_1734291817483.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

pipeline = PretrainedPipeline("xlmroberta_finetuned_semitic_languages_eval_rest14_english_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")

val pipeline = new PretrainedPipeline("xlmroberta_finetuned_semitic_languages_eval_rest14_english_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlmroberta_finetuned_semitic_languages_eval_rest14_english_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|802.3 MB| + +## References + +https://huggingface.co/car13mesquita/xlmroberta-finetuned-sem_eval-rest14-english + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xmlroberta_pan12_gendata_en.md b/docs/_posts/ahmedlone127/2024-12-15-xmlroberta_pan12_gendata_en.md new file mode 100644 index 00000000000000..3959902a1c73ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xmlroberta_pan12_gendata_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xmlroberta_pan12_gendata XlmRoBertaForSequenceClassification from Constien +author: John Snow Labs +name: xmlroberta_pan12_gendata +date: 2024-12-15 +tags: [en, open_source, onnx, sequence_classification, xlm_roberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xmlroberta_pan12_gendata` is a English model originally trained by Constien. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xmlroberta_pan12_gendata_en_5.5.1_3.0_1734293254953.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xmlroberta_pan12_gendata_en_5.5.1_3.0_1734293254953.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xmlroberta_pan12_gendata","en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = XlmRoBertaForSequenceClassification.pretrained("xmlroberta_pan12_gendata", "en")
  .setInputCols(Array("document", "token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
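
For low-latency inference on plain strings, the fitted pipeline can also be wrapped in a `LightPipeline`; a minimal sketch, assuming the default `class` output column name used above:

```python
from sparknlp.base import LightPipeline

# LightPipeline runs the fitted stages on Python strings without a DataFrame.
light = LightPipeline(pipelineModel)
print(light.annotate("I love spark-nlp")["class"])
```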
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xmlroberta_pan12_gendata| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|803.5 MB| + +## References + +https://huggingface.co/Constien/xmlRoberta_Pan12_GenData \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-15-xmlroberta_pan12_gendata_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-15-xmlroberta_pan12_gendata_pipeline_en.md new file mode 100644 index 00000000000000..a8bb4d963cf150 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-15-xmlroberta_pan12_gendata_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xmlroberta_pan12_gendata_pipeline pipeline XlmRoBertaForSequenceClassification from Constien +author: John Snow Labs +name: xmlroberta_pan12_gendata_pipeline +date: 2024-12-15 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xmlroberta_pan12_gendata_pipeline` is a English model originally trained by Constien. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xmlroberta_pan12_gendata_pipeline_en_5.5.1_3.0_1734293384094.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xmlroberta_pan12_gendata_pipeline_en_5.5.1_3.0_1734293384094.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

pipeline = PretrainedPipeline("xmlroberta_pan12_gendata_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")

val pipeline = new PretrainedPipeline("xmlroberta_pan12_gendata_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xmlroberta_pan12_gendata_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|803.6 MB| + +## References + +https://huggingface.co/Constien/xmlRoberta_Pan12_GenData + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-579_stmodel_v1_en.md b/docs/_posts/ahmedlone127/2024-12-16-579_stmodel_v1_en.md new file mode 100644 index 00000000000000..4b026cfa248f3f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-579_stmodel_v1_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English 579_stmodel_v1 MPNetEmbeddings from jamiehudson +author: John Snow Labs +name: 579_stmodel_v1 +date: 2024-12-16 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`579_stmodel_v1` is a English model originally trained by jamiehudson. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/579_stmodel_v1_en_5.5.1_3.0_1734316408701.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/579_stmodel_v1_en_5.5.1_3.0_1734316408701.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("579_stmodel_v1","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("579_stmodel_v1","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
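
The sentence embeddings produced above are stored in the nested `embeddings` field of the `embeddings` output column; one way to pull out the raw vectors for inspection:

```python
from pyspark.sql import functions as F

# Each annotation carries the MPNet vector in its `embeddings` field.
pipelineDF.select(F.explode("embeddings.embeddings").alias("vector")).show(1, truncate=80)
```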
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|579_stmodel_v1| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/jamiehudson/579-STmodel-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-579_stmodel_v1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-579_stmodel_v1_pipeline_en.md new file mode 100644 index 00000000000000..0958068cedb748 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-579_stmodel_v1_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English 579_stmodel_v1_pipeline pipeline MPNetEmbeddings from jamiehudson +author: John Snow Labs +name: 579_stmodel_v1_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`579_stmodel_v1_pipeline` is a English model originally trained by jamiehudson. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/579_stmodel_v1_pipeline_en_5.5.1_3.0_1734316440472.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/579_stmodel_v1_pipeline_en_5.5.1_3.0_1734316440472.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

pipeline = PretrainedPipeline("579_stmodel_v1_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")

val pipeline = new PretrainedPipeline("579_stmodel_v1_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|579_stmodel_v1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/jamiehudson/579-STmodel-v1 + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-8_shot_sta_head_trained_en.md b/docs/_posts/ahmedlone127/2024-12-16-8_shot_sta_head_trained_en.md new file mode 100644 index 00000000000000..b173f21fca98df --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-8_shot_sta_head_trained_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English 8_shot_sta_head_trained MPNetEmbeddings from Nhat1904 +author: John Snow Labs +name: 8_shot_sta_head_trained +date: 2024-12-16 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`8_shot_sta_head_trained` is a English model originally trained by Nhat1904. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/8_shot_sta_head_trained_en_5.5.1_3.0_1734316408793.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/8_shot_sta_head_trained_en_5.5.1_3.0_1734316408793.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("8_shot_sta_head_trained","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("8_shot_sta_head_trained","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|8_shot_sta_head_trained| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|407.0 MB| + +## References + +https://huggingface.co/Nhat1904/8_shot_STA_head_trained \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-8_shot_sta_head_trained_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-8_shot_sta_head_trained_pipeline_en.md new file mode 100644 index 00000000000000..ddbb460a278edf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-8_shot_sta_head_trained_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English 8_shot_sta_head_trained_pipeline pipeline MPNetEmbeddings from Nhat1904 +author: John Snow Labs +name: 8_shot_sta_head_trained_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`8_shot_sta_head_trained_pipeline` is a English model originally trained by Nhat1904. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/8_shot_sta_head_trained_pipeline_en_5.5.1_3.0_1734316446090.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/8_shot_sta_head_trained_pipeline_en_5.5.1_3.0_1734316446090.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

pipeline = PretrainedPipeline("8_shot_sta_head_trained_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")

val pipeline = new PretrainedPipeline("8_shot_sta_head_trained_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|8_shot_sta_head_trained_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.0 MB| + +## References + +https://huggingface.co/Nhat1904/8_shot_STA_head_trained + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-absa_hos_multila_en.md b/docs/_posts/ahmedlone127/2024-12-16-absa_hos_multila_en.md new file mode 100644 index 00000000000000..bf0d4e3034350b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-absa_hos_multila_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English absa_hos_multila T5Transformer from thanmj +author: John Snow Labs +name: absa_hos_multila +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`absa_hos_multila` is a English model originally trained by thanmj. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/absa_hos_multila_en_5.5.1_3.0_1734328333368.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/absa_hos_multila_en_5.5.1_3.0_1734328333368.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("absa_hos_multila","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val t5 = T5Transformer.pretrained("absa_hos_multila", "en")
  .setInputCols(Array("document"))
  .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
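
The text generated by the T5 model is returned in the `result` field of the `output` column, so the predictions can be displayed directly:

```python
# Display the generated text for each input row.
pipelineDF.select("output.result").show(truncate=False)
```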
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|absa_hos_multila| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|993.9 MB| + +## References + +https://huggingface.co/thanmj/ABSA_HOS_MultiLA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-absa_hos_multila_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-absa_hos_multila_pipeline_en.md new file mode 100644 index 00000000000000..66c417631477b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-absa_hos_multila_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English absa_hos_multila_pipeline pipeline T5Transformer from thanmj +author: John Snow Labs +name: absa_hos_multila_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`absa_hos_multila_pipeline` is a English model originally trained by thanmj. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/absa_hos_multila_pipeline_en_5.5.1_3.0_1734328388782.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/absa_hos_multila_pipeline_en_5.5.1_3.0_1734328388782.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")

pipeline = PretrainedPipeline("absa_hos_multila_pipeline", lang = "en")
annotations = pipeline.transform(df)

```
```scala

val df = Seq("I love spark-nlp").toDF("text")

val pipeline = new PretrainedPipeline("absa_hos_multila_pipeline", lang = "en")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|absa_hos_multila_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|993.9 MB| + +## References + +https://huggingface.co/thanmj/ABSA_HOS_MultiLA + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-adapter_13classes_multi_label_en.md b/docs/_posts/ahmedlone127/2024-12-16-adapter_13classes_multi_label_en.md new file mode 100644 index 00000000000000..619178086171d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-adapter_13classes_multi_label_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English adapter_13classes_multi_label T5Transformer from CrisisNarratives +author: John Snow Labs +name: adapter_13classes_multi_label +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`adapter_13classes_multi_label` is a English model originally trained by CrisisNarratives. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/adapter_13classes_multi_label_en_5.5.1_3.0_1734332231906.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/adapter_13classes_multi_label_en_5.5.1_3.0_1734332231906.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("adapter_13classes_multi_label","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("adapter_13classes_multi_label", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
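The `transform` call above leaves the generated text inside the `output` annotation column of `pipelineDF`. The following is a minimal, non-authoritative sketch of reading those strings back out; it only assumes the column names defined in the example above.

```python
# Each row of "output" holds an array of annotations; the "result" field is the generated text.
pipelineDF.selectExpr("explode(output.result) as generated_text").show(truncate=False)
```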
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|adapter_13classes_multi_label| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|520.8 MB| + +## References + +https://huggingface.co/CrisisNarratives/adapter-13classes-multi_label \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-adapter_13classes_multi_label_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-adapter_13classes_multi_label_pipeline_en.md new file mode 100644 index 00000000000000..549bb0eeacb5bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-adapter_13classes_multi_label_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English adapter_13classes_multi_label_pipeline pipeline T5Transformer from CrisisNarratives +author: John Snow Labs +name: adapter_13classes_multi_label_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`adapter_13classes_multi_label_pipeline` is a English model originally trained by CrisisNarratives. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/adapter_13classes_multi_label_pipeline_en_5.5.1_3.0_1734332402763.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/adapter_13classes_multi_label_pipeline_en_5.5.1_3.0_1734332402763.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("adapter_13classes_multi_label_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("adapter_13classes_multi_label_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|adapter_13classes_multi_label_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|520.8 MB| + +## References + +https://huggingface.co/CrisisNarratives/adapter-13classes-multi_label + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-adapter_9classes_single_label_en.md b/docs/_posts/ahmedlone127/2024-12-16-adapter_9classes_single_label_en.md new file mode 100644 index 00000000000000..d95f32f2bc8f88 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-adapter_9classes_single_label_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English adapter_9classes_single_label T5Transformer from CrisisNarratives +author: John Snow Labs +name: adapter_9classes_single_label +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`adapter_9classes_single_label` is a English model originally trained by CrisisNarratives. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/adapter_9classes_single_label_en_5.5.1_3.0_1734327911768.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/adapter_9classes_single_label_en_5.5.1_3.0_1734327911768.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("adapter_9classes_single_label","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("adapter_9classes_single_label", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|adapter_9classes_single_label| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|520.8 MB| + +## References + +https://huggingface.co/CrisisNarratives/adapter-9classes-single_label \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-adapter_9classes_single_label_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-adapter_9classes_single_label_pipeline_en.md new file mode 100644 index 00000000000000..639042ad5ca8d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-adapter_9classes_single_label_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English adapter_9classes_single_label_pipeline pipeline T5Transformer from CrisisNarratives +author: John Snow Labs +name: adapter_9classes_single_label_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`adapter_9classes_single_label_pipeline` is a English model originally trained by CrisisNarratives. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/adapter_9classes_single_label_pipeline_en_5.5.1_3.0_1734328079323.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/adapter_9classes_single_label_pipeline_en_5.5.1_3.0_1734328079323.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("adapter_9classes_single_label_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("adapter_9classes_single_label_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|adapter_9classes_single_label_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|520.8 MB| + +## References + +https://huggingface.co/CrisisNarratives/adapter-9classes-single_label + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-affilgood_ner_multilingual_v2_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-12-16-affilgood_ner_multilingual_v2_pipeline_xx.md new file mode 100644 index 00000000000000..f7cf2eb85d598a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-affilgood_ner_multilingual_v2_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual affilgood_ner_multilingual_v2_pipeline pipeline XlmRoBertaForTokenClassification from nicolauduran45 +author: John Snow Labs +name: affilgood_ner_multilingual_v2_pipeline +date: 2024-12-16 +tags: [xx, open_source, pipeline, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`affilgood_ner_multilingual_v2_pipeline` is a Multilingual model originally trained by nicolauduran45. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/affilgood_ner_multilingual_v2_pipeline_xx_5.5.1_3.0_1734322218576.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/affilgood_ner_multilingual_v2_pipeline_xx_5.5.1_3.0_1734322218576.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("affilgood_ner_multilingual_v2_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("affilgood_ner_multilingual_v2_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|affilgood_ner_multilingual_v2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|1.0 GB| + +## References + +https://huggingface.co/nicolauduran45/affilgood-ner-multilingual-v2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-affilgood_ner_multilingual_v2_xx.md b/docs/_posts/ahmedlone127/2024-12-16-affilgood_ner_multilingual_v2_xx.md new file mode 100644 index 00000000000000..a310479580c557 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-affilgood_ner_multilingual_v2_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual affilgood_ner_multilingual_v2 XlmRoBertaForTokenClassification from nicolauduran45 +author: John Snow Labs +name: affilgood_ner_multilingual_v2 +date: 2024-12-16 +tags: [xx, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`affilgood_ner_multilingual_v2` is a Multilingual model originally trained by nicolauduran45. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/affilgood_ner_multilingual_v2_xx_5.5.1_3.0_1734322162450.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/affilgood_ner_multilingual_v2_xx_5.5.1_3.0_1734322162450.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = XlmRoBertaForTokenClassification.pretrained("affilgood_ner_multilingual_v2","xx") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("affilgood_ner_multilingual_v2", "xx")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
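The `ner` column produced above contains token-level IOB tags. To group them into entity spans, Spark NLP's `NerConverter` can be appended to the same pipeline; this is an illustrative sketch rather than part of the original card, and the entity labels are whatever this model was trained with.

```python
from sparknlp.annotator import NerConverter

# Merges consecutive B-/I- tagged tokens into entity chunks.
ner_converter = NerConverter() \
    .setInputCols(["document", "token", "ner"]) \
    .setOutputCol("ner_chunk")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier, ner_converter])
result = pipeline.fit(data).transform(data)
result.selectExpr("explode(ner_chunk.result) as entity").show(truncate=False)
```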
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|affilgood_ner_multilingual_v2| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|1.0 GB| + +## References + +https://huggingface.co/nicolauduran45/affilgood-ner-multilingual-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-ai_project_en.md b/docs/_posts/ahmedlone127/2024-12-16-ai_project_en.md new file mode 100644 index 00000000000000..1a1b0c9fe6f737 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-ai_project_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English ai_project T5Transformer from Karan-21 +author: John Snow Labs +name: ai_project +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ai_project` is a English model originally trained by Karan-21. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ai_project_en_5.5.1_3.0_1734327417727.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ai_project_en_5.5.1_3.0_1734327417727.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("ai_project","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("ai_project", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ai_project| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|994.4 MB| + +## References + +https://huggingface.co/Karan-21/ai_project \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-ai_project_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-ai_project_pipeline_en.md new file mode 100644 index 00000000000000..f87625517faa29 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-ai_project_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English ai_project_pipeline pipeline T5Transformer from Karan-21 +author: John Snow Labs +name: ai_project_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ai_project_pipeline` is a English model originally trained by Karan-21. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ai_project_pipeline_en_5.5.1_3.0_1734327471185.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ai_project_pipeline_en_5.5.1_3.0_1734327471185.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("ai_project_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("ai_project_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ai_project_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|994.4 MB| + +## References + +https://huggingface.co/Karan-21/ai_project + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-all_mpnet_base_v2_modulepred_en.md b/docs/_posts/ahmedlone127/2024-12-16-all_mpnet_base_v2_modulepred_en.md new file mode 100644 index 00000000000000..5d15c3f7aaa27b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-all_mpnet_base_v2_modulepred_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English all_mpnet_base_v2_modulepred MPNetEmbeddings from carnival13 +author: John Snow Labs +name: all_mpnet_base_v2_modulepred +date: 2024-12-16 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`all_mpnet_base_v2_modulepred` is a English model originally trained by carnival13. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_modulepred_en_5.5.1_3.0_1734316409521.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_modulepred_en_5.5.1_3.0_1734316409521.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("all_mpnet_base_v2_modulepred","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("all_mpnet_base_v2_modulepred","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
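Once `pipelineDF` has been computed as above, the sentence vectors sit inside the `embeddings` annotation column. A minimal sketch of materialising them as plain float arrays (purely illustrative):

```python
# "embeddings.embeddings" is the float array attached to each annotation.
pipelineDF.selectExpr("explode(embeddings.embeddings) as sentence_embedding") \
    .show(1, truncate=80)
```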
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_mpnet_base_v2_modulepred| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.7 MB| + +## References + +https://huggingface.co/carnival13/all-mpnet-base-v2-modulepred \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-all_mpnet_base_v2_modulepred_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-all_mpnet_base_v2_modulepred_pipeline_en.md new file mode 100644 index 00000000000000..2554d084bf6a99 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-all_mpnet_base_v2_modulepred_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English all_mpnet_base_v2_modulepred_pipeline pipeline MPNetEmbeddings from carnival13 +author: John Snow Labs +name: all_mpnet_base_v2_modulepred_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`all_mpnet_base_v2_modulepred_pipeline` is a English model originally trained by carnival13. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_modulepred_pipeline_en_5.5.1_3.0_1734316446897.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_modulepred_pipeline_en_5.5.1_3.0_1734316446897.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("all_mpnet_base_v2_modulepred_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("all_mpnet_base_v2_modulepred_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_mpnet_base_v2_modulepred_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.7 MB| + +## References + +https://huggingface.co/carnival13/all-mpnet-base-v2-modulepred + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-all_mpnet_base_v2_patabs_1epoc_batch32_100_en.md b/docs/_posts/ahmedlone127/2024-12-16-all_mpnet_base_v2_patabs_1epoc_batch32_100_en.md new file mode 100644 index 00000000000000..01ec322d71f674 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-all_mpnet_base_v2_patabs_1epoc_batch32_100_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English all_mpnet_base_v2_patabs_1epoc_batch32_100 MPNetEmbeddings from stephenhib +author: John Snow Labs +name: all_mpnet_base_v2_patabs_1epoc_batch32_100 +date: 2024-12-16 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`all_mpnet_base_v2_patabs_1epoc_batch32_100` is a English model originally trained by stephenhib. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_patabs_1epoc_batch32_100_en_5.5.1_3.0_1734316541919.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_patabs_1epoc_batch32_100_en_5.5.1_3.0_1734316541919.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("all_mpnet_base_v2_patabs_1epoc_batch32_100","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("all_mpnet_base_v2_patabs_1epoc_batch32_100","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_mpnet_base_v2_patabs_1epoc_batch32_100| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/stephenhib/all-mpnet-base-v2-patabs-1epoc-batch32-100 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-all_mpnet_base_v2_patabs_1epoc_batch32_100_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-all_mpnet_base_v2_patabs_1epoc_batch32_100_pipeline_en.md new file mode 100644 index 00000000000000..d6893933ae4589 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-all_mpnet_base_v2_patabs_1epoc_batch32_100_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English all_mpnet_base_v2_patabs_1epoc_batch32_100_pipeline pipeline MPNetEmbeddings from stephenhib +author: John Snow Labs +name: all_mpnet_base_v2_patabs_1epoc_batch32_100_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`all_mpnet_base_v2_patabs_1epoc_batch32_100_pipeline` is a English model originally trained by stephenhib. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_patabs_1epoc_batch32_100_pipeline_en_5.5.1_3.0_1734316562390.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_mpnet_base_v2_patabs_1epoc_batch32_100_pipeline_en_5.5.1_3.0_1734316562390.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("all_mpnet_base_v2_patabs_1epoc_batch32_100_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("all_mpnet_base_v2_patabs_1epoc_batch32_100_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_mpnet_base_v2_patabs_1epoc_batch32_100_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/stephenhib/all-mpnet-base-v2-patabs-1epoc-batch32-100 + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-autotrain_cat_dog_46040114726_en.md b/docs/_posts/ahmedlone127/2024-12-16-autotrain_cat_dog_46040114726_en.md new file mode 100644 index 00000000000000..ad7e3680a64217 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-autotrain_cat_dog_46040114726_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English autotrain_cat_dog_46040114726 SwinForImageClassification from billster45 +author: John Snow Labs +name: autotrain_cat_dog_46040114726 +date: 2024-12-16 +tags: [en, open_source, onnx, image_classification, swin] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: SwinForImageClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`autotrain_cat_dog_46040114726` is a English model originally trained by billster45. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_cat_dog_46040114726_en_5.5.1_3.0_1734325038650.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_cat_dog_46040114726_en_5.5.1_3.0_1734325038650.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

image_assembler = ImageAssembler()\
  .setInputCol("image")\
  .setOutputCol("image_assembler")

imageClassifier = SwinForImageClassification.pretrained("autotrain_cat_dog_46040114726","en")\
  .setInputCols("image_assembler")\
  .setOutputCol("class")

pipeline = Pipeline(stages=[
  image_assembler,
  imageClassifier,
])

pipelineModel = pipeline.fit(imageDF)

pipelineDF = pipelineModel.transform(imageDF)

```
```scala

val imageAssembler = new ImageAssembler()
  .setInputCol("image")
  .setOutputCol("image_assembler")

val imageClassifier = SwinForImageClassification.pretrained("autotrain_cat_dog_46040114726","en")
  .setInputCols("image_assembler")
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(imageAssembler, imageClassifier))

val pipelineModel = pipeline.fit(imageDF)

val pipelineDF = pipelineModel.transform(imageDF)

```
</div>
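`imageDF` is referenced above but never defined. A hedged sketch of creating it with Spark's image data source, as in other Spark NLP image-classification examples (the path is a placeholder):

```python
# Load images into a DataFrame with an "image" column, which ImageAssembler reads.
imageDF = spark.read \
    .format("image") \
    .option("dropInvalid", True) \
    .load("path/to/images/")
```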
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_cat_dog_46040114726| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[image_assembler]| +|Output Labels:|[label]| +|Language:|en| +|Size:|649.8 MB| + +## References + +https://huggingface.co/billster45/autotrain-cat_dog-46040114726 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-autotrain_cat_dog_46040114726_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-autotrain_cat_dog_46040114726_pipeline_en.md new file mode 100644 index 00000000000000..51ef0c578fbf53 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-autotrain_cat_dog_46040114726_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English autotrain_cat_dog_46040114726_pipeline pipeline SwinForImageClassification from billster45 +author: John Snow Labs +name: autotrain_cat_dog_46040114726_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`autotrain_cat_dog_46040114726_pipeline` is a English model originally trained by billster45. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_cat_dog_46040114726_pipeline_en_5.5.1_3.0_1734325072145.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_cat_dog_46040114726_pipeline_en_5.5.1_3.0_1734325072145.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("autotrain_cat_dog_46040114726_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("autotrain_cat_dog_46040114726_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_cat_dog_46040114726_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|649.8 MB| + +## References + +https://huggingface.co/billster45/autotrain-cat_dog-46040114726 + +## Included Models + +- ImageAssembler +- SwinForImageClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-baseline001_noqa_20230913_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-baseline001_noqa_20230913_pipeline_en.md new file mode 100644 index 00000000000000..4026536ce6fd6b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-baseline001_noqa_20230913_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English baseline001_noqa_20230913_pipeline pipeline BertForQuestionAnswering from intanm +author: John Snow Labs +name: baseline001_noqa_20230913_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`baseline001_noqa_20230913_pipeline` is a English model originally trained by intanm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/baseline001_noqa_20230913_pipeline_en_5.5.1_3.0_1734339115450.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/baseline001_noqa_20230913_pipeline_en_5.5.1_3.0_1734339115450.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("baseline001_noqa_20230913_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("baseline001_noqa_20230913_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
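Question-answering pipelines need both a question and a context per row, so the `df` used above must carry two text columns. The sketch below is an assumption: the `question`/`context` input names and the `answer` output column mirror the standalone model examples in this series and are not stated on this card.

```python
from sparknlp.pretrained import PretrainedPipeline

# Assumed input column names: "question" and "context".
df = spark.createDataFrame(
    [["What framework do I use?", "I use spark-nlp."]]
).toDF("question", "context")

pipeline = PretrainedPipeline("baseline001_noqa_20230913_pipeline", lang="en")
annotations = pipeline.transform(df)
annotations.printSchema()  # inspect the output columns; "answer" is assumed, not confirmed
```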
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|baseline001_noqa_20230913_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|665.1 MB| + +## References + +https://huggingface.co/intanm/baseline001-noQA-20230913 + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-bert_base_chinese_finetuned_cmrc2018_en.md b/docs/_posts/ahmedlone127/2024-12-16-bert_base_chinese_finetuned_cmrc2018_en.md new file mode 100644 index 00000000000000..5cfbda7835d960 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-bert_base_chinese_finetuned_cmrc2018_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_base_chinese_finetuned_cmrc2018 BertForQuestionAnswering from real-jiakai +author: John Snow Labs +name: bert_base_chinese_finetuned_cmrc2018 +date: 2024-12-16 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_chinese_finetuned_cmrc2018` is a English model originally trained by real-jiakai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuned_cmrc2018_en_5.5.1_3.0_1734338932534.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuned_cmrc2018_en_5.5.1_3.0_1734338932534.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = MultiDocumentAssembler() \
    .setInputCols(["question", "context"]) \
    .setOutputCols(["document_question", "document_context"])

spanClassifier = BertForQuestionAnswering.pretrained("bert_base_chinese_finetuned_cmrc2018","en") \
    .setInputCols(["document_question","document_context"]) \
    .setOutputCol("answer")

pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new MultiDocumentAssembler()
    .setInputCols(Array("question", "context"))
    .setOutputCols(Array("document_question", "document_context"))

val spanClassifier = BertForQuestionAnswering.pretrained("bert_base_chinese_finetuned_cmrc2018", "en")
    .setInputCols(Array("document_question","document_context"))
    .setOutputCol("answer")

val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDS.toDF("question", "context")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
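After `transform`, the predicted span for each question/context pair is stored in the `answer` column defined above. A short illustrative sketch of reading it back:

```python
# "answer.result" holds the extracted answer text.
pipelineDF.select("answer.result").show(truncate=False)
```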
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_finetuned_cmrc2018| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|381.1 MB| + +## References + +https://huggingface.co/real-jiakai/bert-base-chinese-finetuned-cmrc2018 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-bert_base_spanish_wwm_cased_finetuned_ner_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-bert_base_spanish_wwm_cased_finetuned_ner_pipeline_en.md new file mode 100644 index 00000000000000..39e8f5c3ddfbdd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-bert_base_spanish_wwm_cased_finetuned_ner_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_spanish_wwm_cased_finetuned_ner_pipeline pipeline BertForTokenClassification from raulgdp +author: John Snow Labs +name: bert_base_spanish_wwm_cased_finetuned_ner_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_spanish_wwm_cased_finetuned_ner_pipeline` is a English model originally trained by raulgdp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_finetuned_ner_pipeline_en_5.5.1_3.0_1734337563102.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_finetuned_ner_pipeline_en_5.5.1_3.0_1734337563102.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_spanish_wwm_cased_finetuned_ner_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_spanish_wwm_cased_finetuned_ner_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_spanish_wwm_cased_finetuned_ner_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/raulgdp/bert-base-spanish-wwm-cased-finetuned-ner + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-bert_base_train_en.md b/docs/_posts/ahmedlone127/2024-12-16-bert_base_train_en.md new file mode 100644 index 00000000000000..60160b22a517ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-bert_base_train_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_base_train DistilBertEmbeddings from gokulsrinivasagan +author: John Snow Labs +name: bert_base_train +date: 2024-12-16 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_train` is a English model originally trained by gokulsrinivasagan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_train_en_5.5.1_3.0_1734307564891.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_train_en_5.5.1_3.0_1734307564891.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("bert_base_train","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("bert_base_train","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
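The `embeddings` column above holds one annotation per token. If plain Spark ML vectors are needed (for example, to feed a downstream classifier), Spark NLP's `EmbeddingsFinisher` can be appended to the same pipeline; this is a sketch for illustration, not part of the original card.

```python
from sparknlp.base import EmbeddingsFinisher

# Converts token-level embedding annotations into Spark ML vectors.
finisher = EmbeddingsFinisher() \
    .setInputCols(["embeddings"]) \
    .setOutputCols(["finished_embeddings"]) \
    .setOutputAsVector(True)

pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings, finisher])
result = pipeline.fit(data).transform(data)
result.select("finished_embeddings").show(1, truncate=80)
```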
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_train| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/gokulsrinivasagan/bert_base_train \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-bert_base_train_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-bert_base_train_pipeline_en.md new file mode 100644 index 00000000000000..ca6ff661ef86a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-bert_base_train_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_base_train_pipeline pipeline DistilBertEmbeddings from gokulsrinivasagan +author: John Snow Labs +name: bert_base_train_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_train_pipeline` is a English model originally trained by gokulsrinivasagan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_train_pipeline_en_5.5.1_3.0_1734307584784.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_train_pipeline_en_5.5.1_3.0_1734307584784.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bert_base_train_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bert_base_train_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_train_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/gokulsrinivasagan/bert_base_train + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_0001_swati_0_en.md b/docs/_posts/ahmedlone127/2024-12-16-bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_0001_swati_0_en.md new file mode 100644 index 00000000000000..35beee6b864367 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_0001_swati_0_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_0001_swati_0 BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_0001_swati_0 +date: 2024-12-16 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_0001_swati_0` is a English model originally trained by danielkty22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_0001_swati_0_en_5.5.1_3.0_1734338845956.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_0001_swati_0_en_5.5.1_3.0_1734338845956.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = MultiDocumentAssembler() \
    .setInputCols(["question", "context"]) \
    .setOutputCols(["document_question", "document_context"])

spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_0001_swati_0","en") \
    .setInputCols(["document_question","document_context"]) \
    .setOutputCol("answer")

pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new MultiDocumentAssembler()
    .setInputCols(Array("question", "context"))
    .setOutputCols(Array("document_question", "document_context"))

val spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_0001_swati_0", "en")
    .setInputCols(Array("document_question","document_context"))
    .setOutputCol("answer")

val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDS.toDF("question", "context")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_0001_swati_0| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-1.0-lr-1e-05-wd-0.001-dp-0.0001-ss-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_5_en.md b/docs/_posts/ahmedlone127/2024-12-16-bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_5_en.md new file mode 100644 index 00000000000000..3a90b34b896114 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_5_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_5 BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_5 +date: 2024-12-16 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_5` is a English model originally trained by danielkty22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_5_en_5.5.1_3.0_1734339214155.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_5_en_5.5.1_3.0_1734339214155.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = MultiDocumentAssembler() \
    .setInputCols(["question", "context"]) \
    .setOutputCols(["document_question", "document_context"])

spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_5","en") \
    .setInputCols(["document_question","document_context"]) \
    .setOutputCol("answer")

pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new MultiDocumentAssembler()
    .setInputCols(Array("question", "context"))
    .setOutputCols(Array("document_question", "document_context"))

val spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_5", "en")
    .setInputCols(Array("document_question","document_context"))
    .setOutputCol("answer")

val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDS.toDF("question", "context")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_2_0_lr_1e_05_wd_0_001_dp_0_5| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-2.0-lr-1e-05-wd-0.001-dp-0.5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-bert_base_uncased_finetune_squad_ep_2_0_lr_1e_06_wd_0_001_dp_0_99999_swati_110000_en.md b/docs/_posts/ahmedlone127/2024-12-16-bert_base_uncased_finetune_squad_ep_2_0_lr_1e_06_wd_0_001_dp_0_99999_swati_110000_en.md new file mode 100644 index 00000000000000..43f91a4e2473bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-bert_base_uncased_finetune_squad_ep_2_0_lr_1e_06_wd_0_001_dp_0_99999_swati_110000_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_2_0_lr_1e_06_wd_0_001_dp_0_99999_swati_110000 BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_2_0_lr_1e_06_wd_0_001_dp_0_99999_swati_110000 +date: 2024-12-16 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetune_squad_ep_2_0_lr_1e_06_wd_0_001_dp_0_99999_swati_110000` is a English model originally trained by danielkty22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_2_0_lr_1e_06_wd_0_001_dp_0_99999_swati_110000_en_5.5.1_3.0_1734338754227.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_2_0_lr_1e_06_wd_0_001_dp_0_99999_swati_110000_en_5.5.1_3.0_1734338754227.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+
+# Assemble the raw question and context columns into annotation columns.
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_2_0_lr_1e_06_wd_0_001_dp_0_99999_swati_110000","en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
+data = spark.createDataFrame([["What framework do I use?", "I use spark-nlp."]]).toDF("question", "context")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new MultiDocumentAssembler()
+  .setInputCols(Array("question", "context"))
+  .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_2_0_lr_1e_06_wd_0_001_dp_0_99999_swati_110000", "en")
+  .setInputCols(Array("document_question", "document_context"))
+  .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDF("question", "context")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_2_0_lr_1e_06_wd_0_001_dp_0_99999_swati_110000| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-2.0-lr-1e-06-wd-0.001-dp-0.99999-ss-110000 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_001_dp_0_6_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_001_dp_0_6_pipeline_en.md new file mode 100644 index 00000000000000..1cc5245be45148 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_001_dp_0_6_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_001_dp_0_6_pipeline pipeline BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_001_dp_0_6_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_001_dp_0_6_pipeline` is a English model originally trained by danielkty22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_001_dp_0_6_pipeline_en_5.5.1_3.0_1734339248464.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_001_dp_0_6_pipeline_en_5.5.1_3.0_1734339248464.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.pretrained import PretrainedPipeline
+
+# 'df' is assumed to be a Spark DataFrame that already holds the input column(s) this pipeline expects.
+pipeline = PretrainedPipeline("bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_001_dp_0_6_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
+
+// 'df' is assumed to be a DataFrame that already holds the input column(s) this pipeline expects.
+val pipeline = new PretrainedPipeline("bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_001_dp_0_6_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
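+
+The snippet above assumes an existing DataFrame `df`. As a minimal sketch (the expected input column names are an assumption here, mirroring the standalone model card; check the pipeline's `MultiDocumentAssembler` stage if they differ), the input can be built and the answers inspected like this:
+
+```python
+# Hypothetical input; the column names "question" and "context" are assumed, not taken from the pipeline metadata.
+df = spark.createDataFrame(
+    [["What framework do I use?", "I use spark-nlp."]]
+).toDF("question", "context")
+
+annotations = pipeline.transform(df)
+annotations.select("answer.result").show(truncate=False)
+```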
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_2_0_lr_5e_05_wd_0_001_dp_0_6_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-2.0-lr-5e-05-wd-0.001-dp-0.6 + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-bert_base_uncased_finetune_squad_ep_5_0_lr_1e_05_wd_0_001_dp_0_01_swati_0_en.md b/docs/_posts/ahmedlone127/2024-12-16-bert_base_uncased_finetune_squad_ep_5_0_lr_1e_05_wd_0_001_dp_0_01_swati_0_en.md new file mode 100644 index 00000000000000..adc7a89cab79a1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-bert_base_uncased_finetune_squad_ep_5_0_lr_1e_05_wd_0_001_dp_0_01_swati_0_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_5_0_lr_1e_05_wd_0_001_dp_0_01_swati_0 BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_5_0_lr_1e_05_wd_0_001_dp_0_01_swati_0 +date: 2024-12-16 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_uncased_finetune_squad_ep_5_0_lr_1e_05_wd_0_001_dp_0_01_swati_0` is a English model originally trained by danielkty22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_5_0_lr_1e_05_wd_0_001_dp_0_01_swati_0_en_5.5.1_3.0_1734339056470.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_5_0_lr_1e_05_wd_0_001_dp_0_01_swati_0_en_5.5.1_3.0_1734339056470.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+
+# Assemble the raw question and context columns into annotation columns.
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_5_0_lr_1e_05_wd_0_001_dp_0_01_swati_0","en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
+data = spark.createDataFrame([["What framework do I use?", "I use spark-nlp."]]).toDF("question", "context")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new MultiDocumentAssembler()
+  .setInputCols(Array("question", "context"))
+  .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_5_0_lr_1e_05_wd_0_001_dp_0_01_swati_0", "en")
+  .setInputCols(Array("document_question", "document_context"))
+  .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDF("question", "context")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_5_0_lr_1e_05_wd_0_001_dp_0_01_swati_0| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-5.0-lr-1e-05-wd-0.001-dp-0.01-ss-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-bert_finetuned_ner_football_en.md b/docs/_posts/ahmedlone127/2024-12-16-bert_finetuned_ner_football_en.md new file mode 100644 index 00000000000000..7323664ef6b3b5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-bert_finetuned_ner_football_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_finetuned_ner_football DistilBertForTokenClassification from ChristianSneffeFleischer +author: John Snow Labs +name: bert_finetuned_ner_football +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_finetuned_ner_football` is a English model originally trained by ChristianSneffeFleischer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_football_en_5.5.1_3.0_1734310331639.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_football_en_5.5.1_3.0_1734310331639.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("bert_finetuned_ner_football","en") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification.pretrained("bert_finetuned_ner_football", "en")
+  .setInputCols(Array("document", "token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
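+
+To see which tag the model assigned to each token, the aligned `token.result` and `ner.result` arrays can be zipped row by row (a small follow-up sketch reusing `pipelineDF` from the example above):
+
+```python
+# Each row holds one sentence; the token and ner arrays are aligned by position.
+row = pipelineDF.select("token.result", "ner.result").collect()[0]
+for token, tag in zip(row[0], row[1]):
+    print(f"{token}\t{tag}")
+```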
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_football| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/ChristianSneffeFleischer/bert-finetuned-ner-football \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-bert_finetuned_ner_football_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-bert_finetuned_ner_football_pipeline_en.md new file mode 100644 index 00000000000000..eef5c6f4488eaf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-bert_finetuned_ner_football_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bert_finetuned_ner_football_pipeline pipeline DistilBertForTokenClassification from ChristianSneffeFleischer +author: John Snow Labs +name: bert_finetuned_ner_football_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_finetuned_ner_football_pipeline` is a English model originally trained by ChristianSneffeFleischer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_football_pipeline_en_5.5.1_3.0_1734310344164.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_football_pipeline_en_5.5.1_3.0_1734310344164.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.pretrained import PretrainedPipeline
+
+# 'df' is assumed to be a Spark DataFrame that already holds the input column(s) this pipeline expects.
+pipeline = PretrainedPipeline("bert_finetuned_ner_football_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
+
+// 'df' is assumed to be a DataFrame that already holds the input column(s) this pipeline expects.
+val pipeline = new PretrainedPipeline("bert_finetuned_ner_football_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
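+
+Here `df` is expected to already exist. A minimal sketch, assuming the saved pipeline reads raw text from a `text` column and writes its predictions to a `ner` column (as in the standalone model card for this checkpoint):
+
+```python
+# Column names are assumptions based on the "Included Models" list below.
+df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+annotations = pipeline.transform(df)
+annotations.select("ner.result").show(truncate=False)
+```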
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_football_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|243.9 MB| + +## References + +https://huggingface.co/ChristianSneffeFleischer/bert-finetuned-ner-football + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-bert_finetuned_squad_boimbukanbaim_en.md b/docs/_posts/ahmedlone127/2024-12-16-bert_finetuned_squad_boimbukanbaim_en.md new file mode 100644 index 00000000000000..63be8a97e1099e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-bert_finetuned_squad_boimbukanbaim_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_finetuned_squad_boimbukanbaim BertForQuestionAnswering from boimbukanbaim +author: John Snow Labs +name: bert_finetuned_squad_boimbukanbaim +date: 2024-12-16 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_finetuned_squad_boimbukanbaim` is a English model originally trained by boimbukanbaim. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_boimbukanbaim_en_5.5.1_3.0_1734338360713.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_boimbukanbaim_en_5.5.1_3.0_1734338360713.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+
+# Assemble the raw question and context columns into annotation columns.
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_boimbukanbaim","en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
+data = spark.createDataFrame([["What framework do I use?", "I use spark-nlp."]]).toDF("question", "context")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new MultiDocumentAssembler()
+  .setInputCols(Array("question", "context"))
+  .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_boimbukanbaim", "en")
+  .setInputCols(Array("document_question", "document_context"))
+  .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDF("question", "context")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_boimbukanbaim| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/boimbukanbaim/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-bert_finetuned_squad_boimbukanbaim_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-bert_finetuned_squad_boimbukanbaim_pipeline_en.md new file mode 100644 index 00000000000000..9358456ca5ccad --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-bert_finetuned_squad_boimbukanbaim_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_finetuned_squad_boimbukanbaim_pipeline pipeline BertForQuestionAnswering from boimbukanbaim +author: John Snow Labs +name: bert_finetuned_squad_boimbukanbaim_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_finetuned_squad_boimbukanbaim_pipeline` is a English model originally trained by boimbukanbaim. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_boimbukanbaim_pipeline_en_5.5.1_3.0_1734338389726.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_boimbukanbaim_pipeline_en_5.5.1_3.0_1734338389726.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.pretrained import PretrainedPipeline
+
+# 'df' is assumed to be a Spark DataFrame that already holds the input column(s) this pipeline expects.
+pipeline = PretrainedPipeline("bert_finetuned_squad_boimbukanbaim_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
+
+// 'df' is assumed to be a DataFrame that already holds the input column(s) this pipeline expects.
+val pipeline = new PretrainedPipeline("bert_finetuned_squad_boimbukanbaim_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_boimbukanbaim_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/boimbukanbaim/bert-finetuned-squad + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-bert_finetuned_squad_prabhatsingh_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-bert_finetuned_squad_prabhatsingh_pipeline_en.md new file mode 100644 index 00000000000000..d0e56bebe837f0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-bert_finetuned_squad_prabhatsingh_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_finetuned_squad_prabhatsingh_pipeline pipeline BertForQuestionAnswering from prabhatsingh +author: John Snow Labs +name: bert_finetuned_squad_prabhatsingh_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_finetuned_squad_prabhatsingh_pipeline` is a English model originally trained by prabhatsingh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_prabhatsingh_pipeline_en_5.5.1_3.0_1734338803682.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_prabhatsingh_pipeline_en_5.5.1_3.0_1734338803682.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.pretrained import PretrainedPipeline
+
+# 'df' is assumed to be a Spark DataFrame that already holds the input column(s) this pipeline expects.
+pipeline = PretrainedPipeline("bert_finetuned_squad_prabhatsingh_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
+
+// 'df' is assumed to be a DataFrame that already holds the input column(s) this pipeline expects.
+val pipeline = new PretrainedPipeline("bert_finetuned_squad_prabhatsingh_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_prabhatsingh_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/prabhatsingh/bert-finetuned-squad + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-bert_finetuned_squad_saadon_en.md b/docs/_posts/ahmedlone127/2024-12-16-bert_finetuned_squad_saadon_en.md new file mode 100644 index 00000000000000..0f19e9cb2e980d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-bert_finetuned_squad_saadon_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_finetuned_squad_saadon BertForQuestionAnswering from SaadoN +author: John Snow Labs +name: bert_finetuned_squad_saadon +date: 2024-12-16 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_finetuned_squad_saadon` is a English model originally trained by SaadoN. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_saadon_en_5.5.1_3.0_1734338283353.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_saadon_en_5.5.1_3.0_1734338283353.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+
+# Assemble the raw question and context columns into annotation columns.
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_saadon","en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
+data = spark.createDataFrame([["What framework do I use?", "I use spark-nlp."]]).toDF("question", "context")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new MultiDocumentAssembler()
+  .setInputCols(Array("question", "context"))
+  .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_saadon", "en")
+  .setInputCols(Array("document_question", "document_context"))
+  .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDF("question", "context")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_saadon| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/SaadoN/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-bert_large_uncased_whole_word_masking_finetuned_squad_training_en.md b/docs/_posts/ahmedlone127/2024-12-16-bert_large_uncased_whole_word_masking_finetuned_squad_training_en.md new file mode 100644 index 00000000000000..8a1da704a82215 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-bert_large_uncased_whole_word_masking_finetuned_squad_training_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bert_large_uncased_whole_word_masking_finetuned_squad_training BertForQuestionAnswering from aaditya +author: John Snow Labs +name: bert_large_uncased_whole_word_masking_finetuned_squad_training +date: 2024-12-16 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_large_uncased_whole_word_masking_finetuned_squad_training` is a English model originally trained by aaditya. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_whole_word_masking_finetuned_squad_training_en_5.5.1_3.0_1734338584265.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_whole_word_masking_finetuned_squad_training_en_5.5.1_3.0_1734338584265.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+
+# Assemble the raw question and context columns into annotation columns.
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_large_uncased_whole_word_masking_finetuned_squad_training","en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
+data = spark.createDataFrame([["What framework do I use?", "I use spark-nlp."]]).toDF("question", "context")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new MultiDocumentAssembler()
+  .setInputCols(Array("question", "context"))
+  .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("bert_large_uncased_whole_word_masking_finetuned_squad_training", "en")
+  .setInputCols(Array("document_question", "document_context"))
+  .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDF("question", "context")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
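+
+This checkpoint is comparatively large (about 1.3 GB, see the table below), so when running locally it can help to start the session with extra driver memory. One possible way to do that with the `sparknlp` helper (the 16G figure is only an illustrative choice):
+
+```python
+import sparknlp
+
+# Start (or restart) the Spark session with more driver memory before loading the model.
+spark = sparknlp.start(memory="16G")
+```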
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_whole_word_masking_finetuned_squad_training| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/aaditya/bert-large-uncased-whole-word-masking-finetuned-squad_training \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-bert_large_uncased_whole_word_masking_finetuned_squad_training_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-bert_large_uncased_whole_word_masking_finetuned_squad_training_pipeline_en.md new file mode 100644 index 00000000000000..9a4af4adedfb84 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-bert_large_uncased_whole_word_masking_finetuned_squad_training_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_large_uncased_whole_word_masking_finetuned_squad_training_pipeline pipeline BertForQuestionAnswering from aaditya +author: John Snow Labs +name: bert_large_uncased_whole_word_masking_finetuned_squad_training_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_large_uncased_whole_word_masking_finetuned_squad_training_pipeline` is a English model originally trained by aaditya. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_whole_word_masking_finetuned_squad_training_pipeline_en_5.5.1_3.0_1734338647725.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_whole_word_masking_finetuned_squad_training_pipeline_en_5.5.1_3.0_1734338647725.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.pretrained import PretrainedPipeline
+
+# 'df' is assumed to be a Spark DataFrame that already holds the input column(s) this pipeline expects.
+pipeline = PretrainedPipeline("bert_large_uncased_whole_word_masking_finetuned_squad_training_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
+
+// 'df' is assumed to be a DataFrame that already holds the input column(s) this pipeline expects.
+val pipeline = new PretrainedPipeline("bert_large_uncased_whole_word_masking_finetuned_squad_training_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_whole_word_masking_finetuned_squad_training_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/aaditya/bert-large-uncased-whole-word-masking-finetuned-squad_training + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-bert_russian_pe_ner_epoch15_en.md b/docs/_posts/ahmedlone127/2024-12-16-bert_russian_pe_ner_epoch15_en.md new file mode 100644 index 00000000000000..76ad42b91a51b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-bert_russian_pe_ner_epoch15_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_russian_pe_ner_epoch15 BertForTokenClassification from Poulami +author: John Snow Labs +name: bert_russian_pe_ner_epoch15 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, bert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_russian_pe_ner_epoch15` is a English model originally trained by Poulami. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_russian_pe_ner_epoch15_en_5.5.1_3.0_1734336884312.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_russian_pe_ner_epoch15_en_5.5.1_3.0_1734336884312.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_russian_pe_ner_epoch15","en") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification.pretrained("bert_russian_pe_ner_epoch15", "en")
+  .setInputCols(Array("document", "token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
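+
+For quick, ad-hoc checks on single strings it can be convenient to wrap the fitted pipeline in a `LightPipeline` instead of building a DataFrame (a sketch reusing `pipelineModel` from the example above; the sample sentence is arbitrary):
+
+```python
+from sparknlp.base import LightPipeline
+
+light = LightPipeline(pipelineModel)
+# annotate() returns a dict keyed by output column, e.g. the predicted tags under "ner".
+print(light.annotate("John works at Google in New York")["ner"])
+```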
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_russian_pe_ner_epoch15| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|664.3 MB| + +## References + +https://huggingface.co/Poulami/bert-ru-PE-NER_epoch15 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-bert_squad_qa_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-bert_squad_qa_pipeline_en.md new file mode 100644 index 00000000000000..b1129e033bf23e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-bert_squad_qa_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bert_squad_qa_pipeline pipeline BertForQuestionAnswering from Abdo36 +author: John Snow Labs +name: bert_squad_qa_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_squad_qa_pipeline` is a English model originally trained by Abdo36. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_squad_qa_pipeline_en_5.5.1_3.0_1734339428137.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_squad_qa_pipeline_en_5.5.1_3.0_1734339428137.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.pretrained import PretrainedPipeline
+
+# 'df' is assumed to be a Spark DataFrame that already holds the input column(s) this pipeline expects.
+pipeline = PretrainedPipeline("bert_squad_qa_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
+
+// 'df' is assumed to be a DataFrame that already holds the input column(s) this pipeline expects.
+val pipeline = new PretrainedPipeline("bert_squad_qa_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_squad_qa_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/Abdo36/Bert-SquAD-QA + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-bert_tiny_ontonotes_en.md b/docs/_posts/ahmedlone127/2024-12-16-bert_tiny_ontonotes_en.md new file mode 100644 index 00000000000000..8ac245ebf82b19 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-bert_tiny_ontonotes_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English bert_tiny_ontonotes BertForTokenClassification from arnabdhar +author: John Snow Labs +name: bert_tiny_ontonotes +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, bert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_tiny_ontonotes` is a English model originally trained by arnabdhar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_ontonotes_en_5.5.1_3.0_1734336715138.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_ontonotes_en_5.5.1_3.0_1734336715138.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_tiny_ontonotes","en") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification.pretrained("bert_tiny_ontonotes", "en")
+  .setInputCols(Array("document", "token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
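+
+If full entity spans are more useful than per-token IOB tags, a `NerConverter` stage can be appended to the same pipeline (an optional extra step, not part of the original card):
+
+```python
+from sparknlp.annotator import NerConverter
+
+# Groups consecutive B-/I- tags into entity chunks.
+converter = NerConverter() \
+    .setInputCols(["document", "token", "ner"]) \
+    .setOutputCol("ner_chunk")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier, converter])
+```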
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tiny_ontonotes| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/arnabdhar/bert-tiny-ontonotes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-bertner_biobert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-bertner_biobert_pipeline_en.md new file mode 100644 index 00000000000000..0b4d2b781f0b25 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-bertner_biobert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English bertner_biobert_pipeline pipeline BertForTokenClassification from Vantwoth +author: John Snow Labs +name: bertner_biobert_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bertner_biobert_pipeline` is a English model originally trained by Vantwoth. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertner_biobert_pipeline_en_5.5.1_3.0_1734337539697.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertner_biobert_pipeline_en_5.5.1_3.0_1734337539697.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.pretrained import PretrainedPipeline
+
+# 'df' is assumed to be a Spark DataFrame that already holds the input column(s) this pipeline expects.
+pipeline = PretrainedPipeline("bertner_biobert_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
+
+// 'df' is assumed to be a DataFrame that already holds the input column(s) this pipeline expects.
+val pipeline = new PretrainedPipeline("bertner_biobert_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bertner_biobert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|256.6 MB| + +## References + +https://huggingface.co/Vantwoth/bertNer-biobert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-beto_biobert_en.md b/docs/_posts/ahmedlone127/2024-12-16-beto_biobert_en.md new file mode 100644 index 00000000000000..7916e3eced104f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-beto_biobert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English beto_biobert BertForTokenClassification from stivenacua17 +author: John Snow Labs +name: beto_biobert +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, bert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`beto_biobert` is a English model originally trained by stivenacua17. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/beto_biobert_en_5.5.1_3.0_1734336547371.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/beto_biobert_en_5.5.1_3.0_1734336547371.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = BertForTokenClassification.pretrained("beto_biobert","en") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification.pretrained("beto_biobert", "en")
+  .setInputCols(Array("document", "token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|beto_biobert| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/stivenacua17/beto-biobert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-beto_biobert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-beto_biobert_pipeline_en.md new file mode 100644 index 00000000000000..26144dc000d8a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-beto_biobert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English beto_biobert_pipeline pipeline BertForTokenClassification from stivenacua17 +author: John Snow Labs +name: beto_biobert_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`beto_biobert_pipeline` is a English model originally trained by stivenacua17. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/beto_biobert_pipeline_en_5.5.1_3.0_1734336568749.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/beto_biobert_pipeline_en_5.5.1_3.0_1734336568749.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.pretrained import PretrainedPipeline
+
+# 'df' is assumed to be a Spark DataFrame that already holds the input column(s) this pipeline expects.
+pipeline = PretrainedPipeline("beto_biobert_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
+
+// 'df' is assumed to be a DataFrame that already holds the input column(s) this pipeline expects.
+val pipeline = new PretrainedPipeline("beto_biobert_pipeline", lang = "en")
+val annotations = pipeline.transform(df)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|beto_biobert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/stivenacua17/beto-biobert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-beto_prescripciones_medicas_es.md b/docs/_posts/ahmedlone127/2024-12-16-beto_prescripciones_medicas_es.md new file mode 100644 index 00000000000000..9496f34d21fc04 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-beto_prescripciones_medicas_es.md @@ -0,0 +1,96 @@ +--- +layout: model +title: Castilian, Spanish beto_prescripciones_medicas BertForTokenClassification from ccarvajal +author: John Snow Labs +name: beto_prescripciones_medicas +date: 2024-12-16 +tags: [es, open_source, onnx, token_classification, bert, ner] +task: Named Entity Recognition +language: es +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`beto_prescripciones_medicas` is a Castilian, Spanish model originally trained by ccarvajal. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/beto_prescripciones_medicas_es_5.5.1_3.0_1734337424276.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/beto_prescripciones_medicas_es_5.5.1_3.0_1734337424276.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = BertForTokenClassification.pretrained("beto_prescripciones_medicas","es") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification.pretrained("beto_prescripciones_medicas", "es")
+  .setInputCols(Array("document", "token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
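+
+Since this checkpoint targets Spanish medical prescriptions, a Spanish input is more representative than the English placeholder sentence above (the prescription below is a made-up example, and no claim is made here about the model's exact label set):
+
+```python
+data = spark.createDataFrame([["Paracetamol 500 mg cada 8 horas durante 3 días"]]).toDF("text")
+pipelineModel.transform(data).select("token.result", "ner.result").show(truncate=False)
+```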
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|beto_prescripciones_medicas| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|es| +|Size:|409.5 MB| + +## References + +References + +https://huggingface.co/ccarvajal/beto-prescripciones-medicas \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-beto_prescripciones_medicas_pipeline_es.md b/docs/_posts/ahmedlone127/2024-12-16-beto_prescripciones_medicas_pipeline_es.md new file mode 100644 index 00000000000000..a081a63eb8ba53 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-beto_prescripciones_medicas_pipeline_es.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Castilian, Spanish beto_prescripciones_medicas_pipeline pipeline BertForTokenClassification from ccarvajal-reyes +author: John Snow Labs +name: beto_prescripciones_medicas_pipeline +date: 2024-12-16 +tags: [es, open_source, pipeline, onnx] +task: Named Entity Recognition +language: es +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`beto_prescripciones_medicas_pipeline` is a Castilian, Spanish model originally trained by ccarvajal-reyes. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/beto_prescripciones_medicas_pipeline_es_5.5.1_3.0_1734337445118.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/beto_prescripciones_medicas_pipeline_es_5.5.1_3.0_1734337445118.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.pretrained import PretrainedPipeline
+
+# 'df' is assumed to be a Spark DataFrame that already holds the input column(s) this pipeline expects.
+pipeline = PretrainedPipeline("beto_prescripciones_medicas_pipeline", lang = "es")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
+
+// 'df' is assumed to be a DataFrame that already holds the input column(s) this pipeline expects.
+val pipeline = new PretrainedPipeline("beto_prescripciones_medicas_pipeline", lang = "es")
+val annotations = pipeline.transform(df)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|beto_prescripciones_medicas_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|es| +|Size:|409.5 MB| + +## References + +https://huggingface.co/ccarvajal-reyes/beto-prescripciones-medicas + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k_en.md b/docs/_posts/ahmedlone127/2024-12-16-bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k_en.md new file mode 100644 index 00000000000000..ce41537eb800d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k T5Transformer from MittyN +author: John Snow Labs +name: bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k` is a English model originally trained by MittyN. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k_en_5.5.1_3.0_1734331632500.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k_en_5.5.1_3.0_1734331632500.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k", "en")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
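+
+Generation behaviour can be tuned on the `T5Transformer` annotator itself, for instance by capping the output length (a sketch with an arbitrary limit; whether this checkpoint also expects a task prefix on its input depends on how it was fine-tuned, which the card does not state):
+
+```python
+t5 = T5Transformer.pretrained("bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k", "en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output") \
+    .setMaxOutputLength(128)  # arbitrary cap on the number of generated tokens
+```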
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|2.3 GB| + +## References + +https://huggingface.co/MittyN/bslm-entity-extraction-mt5-base-include-desc-normalized-tr243k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k_pipeline_en.md new file mode 100644 index 00000000000000..0206969c3157c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k_pipeline pipeline T5Transformer from MittyN +author: John Snow Labs +name: bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k_pipeline` is a English model originally trained by MittyN. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k_pipeline_en_5.5.1_3.0_1734331838176.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k_pipeline_en_5.5.1_3.0_1734331838176.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
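+
+Note that `df` above is assumed to be a Spark DataFrame with a `text` column. A minimal sketch of preparing one and checking which output columns the pipeline produces (column names can differ per pipeline):
+
+```python
+# Hypothetical input DataFrame with a "text" column.
+df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+
+pipeline = PretrainedPipeline("bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k_pipeline", lang="en")
+result = pipeline.transform(df)
+result.printSchema()  # inspect the output columns this pipeline produces
+```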
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bslm_entity_extraction_mt5_base_include_desc_normalized_tr243k_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|2.3 GB| + +## References + +https://huggingface.co/MittyN/bslm-entity-extraction-mt5-base-include-desc-normalized-tr243k + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_dailymail_baseline_model_en.md b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_dailymail_baseline_model_en.md new file mode 100644 index 00000000000000..d0c3ec6e9cfc09 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_dailymail_baseline_model_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English burmese_awesome_dailymail_baseline_model T5Transformer from Zlovoblachko +author: John Snow Labs +name: burmese_awesome_dailymail_baseline_model +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_awesome_dailymail_baseline_model` is a English model originally trained by Zlovoblachko. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_dailymail_baseline_model_en_5.5.1_3.0_1734332193831.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_dailymail_baseline_model_en_5.5.1_3.0_1734332193831.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("burmese_awesome_dailymail_baseline_model","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("burmese_awesome_dailymail_baseline_model", "en")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_dailymail_baseline_model| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|334.5 MB| + +## References + +https://huggingface.co/Zlovoblachko/my_awesome_dailymail_baseline_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_dailymail_baseline_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_dailymail_baseline_model_pipeline_en.md new file mode 100644 index 00000000000000..94b69c54106efe --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_dailymail_baseline_model_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English burmese_awesome_dailymail_baseline_model_pipeline pipeline T5Transformer from Zlovoblachko +author: John Snow Labs +name: burmese_awesome_dailymail_baseline_model_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_awesome_dailymail_baseline_model_pipeline` is a English model originally trained by Zlovoblachko. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_dailymail_baseline_model_pipeline_en_5.5.1_3.0_1734332214496.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_dailymail_baseline_model_pipeline_en_5.5.1_3.0_1734332214496.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("burmese_awesome_dailymail_baseline_model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("burmese_awesome_dailymail_baseline_model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_dailymail_baseline_model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|334.5 MB| + +## References + +https://huggingface.co/Zlovoblachko/my_awesome_dailymail_baseline_model + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_opus_books_model_agaresd_en.md b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_opus_books_model_agaresd_en.md new file mode 100644 index 00000000000000..e924ac433ded20 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_opus_books_model_agaresd_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English burmese_awesome_opus_books_model_agaresd T5Transformer from agaresd +author: John Snow Labs +name: burmese_awesome_opus_books_model_agaresd +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_awesome_opus_books_model_agaresd` is a English model originally trained by agaresd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_opus_books_model_agaresd_en_5.5.1_3.0_1734328817442.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_opus_books_model_agaresd_en_5.5.1_3.0_1734328817442.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("burmese_awesome_opus_books_model_agaresd","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("burmese_awesome_opus_books_model_agaresd", "en")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_opus_books_model_agaresd| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|340.4 MB| + +## References + +https://huggingface.co/agaresd/my_awesome_opus_books_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_opus_books_model_agaresd_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_opus_books_model_agaresd_pipeline_en.md new file mode 100644 index 00000000000000..f0a6149b3dfecc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_opus_books_model_agaresd_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English burmese_awesome_opus_books_model_agaresd_pipeline pipeline T5Transformer from agaresd +author: John Snow Labs +name: burmese_awesome_opus_books_model_agaresd_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_awesome_opus_books_model_agaresd_pipeline` is a English model originally trained by agaresd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_opus_books_model_agaresd_pipeline_en_5.5.1_3.0_1734328836683.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_opus_books_model_agaresd_pipeline_en_5.5.1_3.0_1734328836683.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("burmese_awesome_opus_books_model_agaresd_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("burmese_awesome_opus_books_model_agaresd_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_opus_books_model_agaresd_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|340.4 MB| + +## References + +https://huggingface.co/agaresd/my_awesome_opus_books_model + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_opus_books_model_paulusfmx_en.md b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_opus_books_model_paulusfmx_en.md new file mode 100644 index 00000000000000..d3733b6e3aea03 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_opus_books_model_paulusfmx_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English burmese_awesome_opus_books_model_paulusfmx T5Transformer from Paulusfmx +author: John Snow Labs +name: burmese_awesome_opus_books_model_paulusfmx +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_awesome_opus_books_model_paulusfmx` is a English model originally trained by Paulusfmx. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_opus_books_model_paulusfmx_en_5.5.1_3.0_1734327127413.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_opus_books_model_paulusfmx_en_5.5.1_3.0_1734327127413.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("burmese_awesome_opus_books_model_paulusfmx","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("burmese_awesome_opus_books_model_paulusfmx", "en")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_opus_books_model_paulusfmx| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|341.9 MB| + +## References + +https://huggingface.co/Paulusfmx/my_awesome_opus_books_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_opus_books_model_paulusfmx_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_opus_books_model_paulusfmx_pipeline_en.md new file mode 100644 index 00000000000000..45f88b7de7769a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_opus_books_model_paulusfmx_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English burmese_awesome_opus_books_model_paulusfmx_pipeline pipeline T5Transformer from Paulusfmx +author: John Snow Labs +name: burmese_awesome_opus_books_model_paulusfmx_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_awesome_opus_books_model_paulusfmx_pipeline` is a English model originally trained by Paulusfmx. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_opus_books_model_paulusfmx_pipeline_en_5.5.1_3.0_1734327149406.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_opus_books_model_paulusfmx_pipeline_en_5.5.1_3.0_1734327149406.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("burmese_awesome_opus_books_model_paulusfmx_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("burmese_awesome_opus_books_model_paulusfmx_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_opus_books_model_paulusfmx_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|341.9 MB| + +## References + +https://huggingface.co/Paulusfmx/my_awesome_opus_books_model + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_opus_books_model_sarveshchaudhari_en.md b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_opus_books_model_sarveshchaudhari_en.md new file mode 100644 index 00000000000000..36cf271ae8f473 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_opus_books_model_sarveshchaudhari_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English burmese_awesome_opus_books_model_sarveshchaudhari T5Transformer from sarveshchaudhari +author: John Snow Labs +name: burmese_awesome_opus_books_model_sarveshchaudhari +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_awesome_opus_books_model_sarveshchaudhari` is a English model originally trained by sarveshchaudhari. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_opus_books_model_sarveshchaudhari_en_5.5.1_3.0_1734332052244.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_opus_books_model_sarveshchaudhari_en_5.5.1_3.0_1734332052244.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("burmese_awesome_opus_books_model_sarveshchaudhari","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("burmese_awesome_opus_books_model_sarveshchaudhari", "en")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_opus_books_model_sarveshchaudhari| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|298.2 MB| + +## References + +https://huggingface.co/sarveshchaudhari/my_awesome_opus_books_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_opus_books_model_sarveshchaudhari_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_opus_books_model_sarveshchaudhari_pipeline_en.md new file mode 100644 index 00000000000000..a4d59e15f07e2a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_opus_books_model_sarveshchaudhari_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English burmese_awesome_opus_books_model_sarveshchaudhari_pipeline pipeline T5Transformer from sarveshchaudhari +author: John Snow Labs +name: burmese_awesome_opus_books_model_sarveshchaudhari_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_awesome_opus_books_model_sarveshchaudhari_pipeline` is a English model originally trained by sarveshchaudhari. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_opus_books_model_sarveshchaudhari_pipeline_en_5.5.1_3.0_1734332077443.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_opus_books_model_sarveshchaudhari_pipeline_en_5.5.1_3.0_1734332077443.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("burmese_awesome_opus_books_model_sarveshchaudhari_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("burmese_awesome_opus_books_model_sarveshchaudhari_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_opus_books_model_sarveshchaudhari_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|298.2 MB| + +## References + +https://huggingface.co/sarveshchaudhari/my_awesome_opus_books_model + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_wnut_model_cotysong113_en.md b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_wnut_model_cotysong113_en.md new file mode 100644 index 00000000000000..652067d5e4ff0a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_wnut_model_cotysong113_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_cotysong113 DistilBertForTokenClassification from cotysong113 +author: John Snow Labs +name: burmese_awesome_wnut_model_cotysong113 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_awesome_wnut_model_cotysong113` is a English model originally trained by cotysong113. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_cotysong113_en_5.5.1_3.0_1734310266305.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_cotysong113_en_5.5.1_3.0_1734310266305.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_cotysong113","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_cotysong113", "en")
+  .setInputCols(Array("document","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
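+
+A minimal way to view the predicted labels next to their tokens, assuming the Python pipeline above has been run:
+
+```python
+# "token" and "ner" are the output columns set on the annotators above.
+pipelineDF.select("token.result", "ner.result").show(truncate=False)
+```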
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_cotysong113| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/cotysong113/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_wnut_model_cotysong113_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_wnut_model_cotysong113_pipeline_en.md new file mode 100644 index 00000000000000..2a7483946f0fe3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_wnut_model_cotysong113_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_cotysong113_pipeline pipeline DistilBertForTokenClassification from cotysong113 +author: John Snow Labs +name: burmese_awesome_wnut_model_cotysong113_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_awesome_wnut_model_cotysong113_pipeline` is a English model originally trained by cotysong113. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_cotysong113_pipeline_en_5.5.1_3.0_1734310280160.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_cotysong113_pipeline_en_5.5.1_3.0_1734310280160.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("burmese_awesome_wnut_model_cotysong113_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("burmese_awesome_wnut_model_cotysong113_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_cotysong113_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/cotysong113/my_awesome_wnut_model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_wnut_model_masterkristall_en.md b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_wnut_model_masterkristall_en.md new file mode 100644 index 00000000000000..d4884a1576db23 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_wnut_model_masterkristall_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_masterkristall DistilBertForTokenClassification from masterkristall +author: John Snow Labs +name: burmese_awesome_wnut_model_masterkristall +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_awesome_wnut_model_masterkristall` is a English model originally trained by masterkristall. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_masterkristall_en_5.5.1_3.0_1734310859916.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_masterkristall_en_5.5.1_3.0_1734310859916.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_masterkristall","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_masterkristall", "en")
+  .setInputCols(Array("document","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_masterkristall| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/masterkristall/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_wnut_model_masterkristall_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_wnut_model_masterkristall_pipeline_en.md new file mode 100644 index 00000000000000..96de18c9120a23 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_wnut_model_masterkristall_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_masterkristall_pipeline pipeline DistilBertForTokenClassification from masterkristall +author: John Snow Labs +name: burmese_awesome_wnut_model_masterkristall_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_awesome_wnut_model_masterkristall_pipeline` is a English model originally trained by masterkristall. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_masterkristall_pipeline_en_5.5.1_3.0_1734310872692.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_masterkristall_pipeline_en_5.5.1_3.0_1734310872692.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("burmese_awesome_wnut_model_masterkristall_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("burmese_awesome_wnut_model_masterkristall_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_masterkristall_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/masterkristall/my_awesome_wnut_model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_wnut_model_you_g_en.md b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_wnut_model_you_g_en.md new file mode 100644 index 00000000000000..8165dbf0829e4e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_wnut_model_you_g_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_you_g DistilBertForTokenClassification from YOU-G +author: John Snow Labs +name: burmese_awesome_wnut_model_you_g +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_awesome_wnut_model_you_g` is a English model originally trained by YOU-G. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_you_g_en_5.5.1_3.0_1734310721036.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_you_g_en_5.5.1_3.0_1734310721036.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_you_g","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_you_g", "en")
+  .setInputCols(Array("document","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_you_g| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/YOU-G/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_wnut_model_you_g_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_wnut_model_you_g_pipeline_en.md new file mode 100644 index 00000000000000..dda433a39181d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-burmese_awesome_wnut_model_you_g_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_you_g_pipeline pipeline DistilBertForTokenClassification from YOU-G +author: John Snow Labs +name: burmese_awesome_wnut_model_you_g_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_awesome_wnut_model_you_g_pipeline` is a English model originally trained by YOU-G. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_you_g_pipeline_en_5.5.1_3.0_1734310733921.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_you_g_pipeline_en_5.5.1_3.0_1734310733921.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("burmese_awesome_wnut_model_you_g_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("burmese_awesome_wnut_model_you_g_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_you_g_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/YOU-G/my_awesome_wnut_model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-burmese_bert_nepal_bhasa_version_6_0_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-burmese_bert_nepal_bhasa_version_6_0_pipeline_en.md new file mode 100644 index 00000000000000..6f313003a4cdc9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-burmese_bert_nepal_bhasa_version_6_0_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English burmese_bert_nepal_bhasa_version_6_0_pipeline pipeline BertForQuestionAnswering from Ashkh0099 +author: John Snow Labs +name: burmese_bert_nepal_bhasa_version_6_0_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_bert_nepal_bhasa_version_6_0_pipeline` is a English model originally trained by Ashkh0099. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_bert_nepal_bhasa_version_6_0_pipeline_en_5.5.1_3.0_1734338819381.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_bert_nepal_bhasa_version_6_0_pipeline_en_5.5.1_3.0_1734338819381.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("burmese_bert_nepal_bhasa_version_6_0_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("burmese_bert_nepal_bhasa_version_6_0_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
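+
+Since this pipeline starts with a MultiDocumentAssembler (see Included Models below), `df` is expected to carry two input columns. The column names used here are an assumption for illustration only:
+
+```python
+# Hypothetical question/context DataFrame for the QA pipeline.
+df = spark.createDataFrame(
+    [["What is Spark NLP?", "Spark NLP is an open-source NLP library built on Apache Spark."]]
+).toDF("question", "context")
+
+pipeline = PretrainedPipeline("burmese_bert_nepal_bhasa_version_6_0_pipeline", lang="en")
+result = pipeline.transform(df)
+result.printSchema()  # inspect the answer column produced by the pipeline
+```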
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_bert_nepal_bhasa_version_6_0_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/Ashkh0099/my-bert-new-version-6.0 + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-burmese_fantastic_patent_model_en.md b/docs/_posts/ahmedlone127/2024-12-16-burmese_fantastic_patent_model_en.md new file mode 100644 index 00000000000000..c65a69aa0d6fce --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-burmese_fantastic_patent_model_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English burmese_fantastic_patent_model T5Transformer from dmen24 +author: John Snow Labs +name: burmese_fantastic_patent_model +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_fantastic_patent_model` is a English model originally trained by dmen24. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_fantastic_patent_model_en_5.5.1_3.0_1734328247117.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_fantastic_patent_model_en_5.5.1_3.0_1734328247117.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("burmese_fantastic_patent_model","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("burmese_fantastic_patent_model", "en")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_fantastic_patent_model| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|327.6 MB| + +## References + +https://huggingface.co/dmen24/my_fantastic_patent_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-burmese_fantastic_patent_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-burmese_fantastic_patent_model_pipeline_en.md new file mode 100644 index 00000000000000..03f28b26dba114 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-burmese_fantastic_patent_model_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English burmese_fantastic_patent_model_pipeline pipeline T5Transformer from dmen24 +author: John Snow Labs +name: burmese_fantastic_patent_model_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`burmese_fantastic_patent_model_pipeline` is a English model originally trained by dmen24. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_fantastic_patent_model_pipeline_en_5.5.1_3.0_1734328268213.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_fantastic_patent_model_pipeline_en_5.5.1_3.0_1734328268213.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("burmese_fantastic_patent_model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("burmese_fantastic_patent_model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_fantastic_patent_model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|327.6 MB| + +## References + +https://huggingface.co/dmen24/my_fantastic_patent_model + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-camelbert_msa_zaebuc_ged_43_ar.md b/docs/_posts/ahmedlone127/2024-12-16-camelbert_msa_zaebuc_ged_43_ar.md new file mode 100644 index 00000000000000..8799287db6327d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-camelbert_msa_zaebuc_ged_43_ar.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Arabic camelbert_msa_zaebuc_ged_43 BertForTokenClassification from CAMeL-Lab +author: John Snow Labs +name: camelbert_msa_zaebuc_ged_43 +date: 2024-12-16 +tags: [ar, open_source, onnx, token_classification, bert, ner] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camelbert_msa_zaebuc_ged_43` is a Arabic model originally trained by CAMeL-Lab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camelbert_msa_zaebuc_ged_43_ar_5.5.1_3.0_1734337056089.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camelbert_msa_zaebuc_ged_43_ar_5.5.1_3.0_1734337056089.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = BertForTokenClassification.pretrained("camelbert_msa_zaebuc_ged_43","ar") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification.pretrained("camelbert_msa_zaebuc_ged_43", "ar")
+  .setInputCols(Array("document","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camelbert_msa_zaebuc_ged_43| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|406.5 MB| + +## References + +https://huggingface.co/CAMeL-Lab/camelbert-msa-zaebuc-ged-43 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-camelbert_msa_zaebuc_ged_43_pipeline_ar.md b/docs/_posts/ahmedlone127/2024-12-16-camelbert_msa_zaebuc_ged_43_pipeline_ar.md new file mode 100644 index 00000000000000..b8d695f3eeceac --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-camelbert_msa_zaebuc_ged_43_pipeline_ar.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Arabic camelbert_msa_zaebuc_ged_43_pipeline pipeline BertForTokenClassification from CAMeL-Lab +author: John Snow Labs +name: camelbert_msa_zaebuc_ged_43_pipeline +date: 2024-12-16 +tags: [ar, open_source, pipeline, onnx] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camelbert_msa_zaebuc_ged_43_pipeline` is a Arabic model originally trained by CAMeL-Lab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camelbert_msa_zaebuc_ged_43_pipeline_ar_5.5.1_3.0_1734337076699.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camelbert_msa_zaebuc_ged_43_pipeline_ar_5.5.1_3.0_1734337076699.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("camelbert_msa_zaebuc_ged_43_pipeline", lang = "ar") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("camelbert_msa_zaebuc_ged_43_pipeline", lang = "ar") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camelbert_msa_zaebuc_ged_43_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|ar| +|Size:|406.5 MB| + +## References + +https://huggingface.co/CAMeL-Lab/camelbert-msa-zaebuc-ged-43 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-camelbert_ner_arabic_ar.md b/docs/_posts/ahmedlone127/2024-12-16-camelbert_ner_arabic_ar.md new file mode 100644 index 00000000000000..e4de0e6a9f6163 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-camelbert_ner_arabic_ar.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Arabic camelbert_ner_arabic BertForTokenClassification from Tevfik-istanbullu +author: John Snow Labs +name: camelbert_ner_arabic +date: 2024-12-16 +tags: [ar, open_source, onnx, token_classification, bert, ner] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camelbert_ner_arabic` is a Arabic model originally trained by Tevfik-istanbullu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camelbert_ner_arabic_ar_5.5.1_3.0_1734337418435.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camelbert_ner_arabic_ar_5.5.1_3.0_1734337418435.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = BertForTokenClassification.pretrained("camelbert_ner_arabic","ar") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification.pretrained("camelbert_ner_arabic", "ar")
+  .setInputCols(Array("document","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camelbert_ner_arabic| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|406.6 MB| + +## References + +https://huggingface.co/Tevfik-istanbullu/camelbert-ner-arabic \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-camelbert_ner_arabic_pipeline_ar.md b/docs/_posts/ahmedlone127/2024-12-16-camelbert_ner_arabic_pipeline_ar.md new file mode 100644 index 00000000000000..96b23a866ed8cd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-camelbert_ner_arabic_pipeline_ar.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Arabic camelbert_ner_arabic_pipeline pipeline BertForTokenClassification from Tevfik-istanbullu +author: John Snow Labs +name: camelbert_ner_arabic_pipeline +date: 2024-12-16 +tags: [ar, open_source, pipeline, onnx] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`camelbert_ner_arabic_pipeline` is a Arabic model originally trained by Tevfik-istanbullu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/camelbert_ner_arabic_pipeline_ar_5.5.1_3.0_1734337439617.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/camelbert_ner_arabic_pipeline_ar_5.5.1_3.0_1734337439617.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("camelbert_ner_arabic_pipeline", lang = "ar") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("camelbert_ner_arabic_pipeline", lang = "ar") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|camelbert_ner_arabic_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|ar| +|Size:|406.7 MB| + +## References + +https://huggingface.co/Tevfik-istanbullu/camelbert-ner-arabic + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_dough_100_epochs_en.md b/docs/_posts/ahmedlone127/2024-12-16-cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_dough_100_epochs_en.md new file mode 100644 index 00000000000000..d073361613cd9b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_dough_100_epochs_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_dough_100_epochs SwinForImageClassification from sai17 +author: John Snow Labs +name: cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_dough_100_epochs +date: 2024-12-16 +tags: [en, open_source, onnx, image_classification, swin] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: SwinForImageClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_dough_100_epochs` is a English model originally trained by sai17. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_dough_100_epochs_en_5.5.1_3.0_1734324937442.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_dough_100_epochs_en_5.5.1_3.0_1734324937442.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

image_assembler = ImageAssembler()\
  .setInputCol("image")\
  .setOutputCol("image_assembler")

imageClassifier = SwinForImageClassification.pretrained("cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_dough_100_epochs","en")\
  .setInputCols("image_assembler")\
  .setOutputCol("class")

pipeline = Pipeline(stages=[
  image_assembler,
  imageClassifier,
])

pipelineModel = pipeline.fit(imageDF)

pipelineDF = pipelineModel.transform(imageDF)

```
```scala

val imageAssembler = new ImageAssembler()
  .setInputCol("image")
  .setOutputCol("image_assembler")

val imageClassifier = SwinForImageClassification.pretrained("cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_dough_100_epochs","en")
  .setInputCols("image_assembler")
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(imageAssembler, imageClassifier))

val pipelineModel = pipeline.fit(imageDF)

val pipelineDF = pipelineModel.transform(imageDF)

```
</div>
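
`imageDF` is not defined in the snippet above; it is expected to be a Spark DataFrame of images. A minimal sketch of how it could be built and how the prediction can be read back, assuming a local folder of images (the path is a placeholder):

```python
# Build the `imageDF` expected by the pipeline above (the path is a placeholder).
imageDF = spark.read \
    .format("image") \
    .option("dropInvalid", True) \
    .load("path/to/card_images/")

# After transform, the predicted label sits in the `class` output column.
pipelineDF.select("image.origin", "class.result").show(truncate=False)
```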
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_dough_100_epochs| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[image_assembler]| +|Output Labels:|[label]| +|Language:|en| +|Size:|206.3 MB| + +## References + +https://huggingface.co/sai17/cards_bottom_left_swin-tiny-patch4-window7-224-finetuned-dough_100_epochs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_dough_100_epochs_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_dough_100_epochs_pipeline_en.md new file mode 100644 index 00000000000000..d7ff52d141e660 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_dough_100_epochs_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_dough_100_epochs_pipeline pipeline SwinForImageClassification from sai17 +author: John Snow Labs +name: cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_dough_100_epochs_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_dough_100_epochs_pipeline` is a English model originally trained by sai17. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_dough_100_epochs_pipeline_en_5.5.1_3.0_1734324948156.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_dough_100_epochs_pipeline_en_5.5.1_3.0_1734324948156.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_dough_100_epochs_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_dough_100_epochs_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_dough_100_epochs_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|206.3 MB| + +## References + +https://huggingface.co/sai17/cards_bottom_left_swin-tiny-patch4-window7-224-finetuned-dough_100_epochs + +## Included Models + +- ImageAssembler +- SwinForImageClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_v2_more_data_en.md b/docs/_posts/ahmedlone127/2024-12-16-cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_v2_more_data_en.md new file mode 100644 index 00000000000000..9751a325345c46 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_v2_more_data_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_v2_more_data SwinForImageClassification from sai17 +author: John Snow Labs +name: cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_v2_more_data +date: 2024-12-16 +tags: [en, open_source, onnx, image_classification, swin] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: SwinForImageClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_v2_more_data` is a English model originally trained by sai17. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_v2_more_data_en_5.5.1_3.0_1734325275587.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_v2_more_data_en_5.5.1_3.0_1734325275587.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

image_assembler = ImageAssembler()\
  .setInputCol("image")\
  .setOutputCol("image_assembler")

imageClassifier = SwinForImageClassification.pretrained("cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_v2_more_data","en")\
  .setInputCols("image_assembler")\
  .setOutputCol("class")

pipeline = Pipeline(stages=[
  image_assembler,
  imageClassifier,
])

pipelineModel = pipeline.fit(imageDF)

pipelineDF = pipelineModel.transform(imageDF)

```
```scala

val imageAssembler = new ImageAssembler()
  .setInputCol("image")
  .setOutputCol("image_assembler")

val imageClassifier = SwinForImageClassification.pretrained("cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_v2_more_data","en")
  .setInputCols("image_assembler")
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(imageAssembler, imageClassifier))

val pipelineModel = pipeline.fit(imageDF)

val pipelineDF = pipelineModel.transform(imageDF)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_v2_more_data| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[image_assembler]| +|Output Labels:|[label]| +|Language:|en| +|Size:|206.3 MB| + +## References + +https://huggingface.co/sai17/cards_bottom_left_swin-tiny-patch4-window7-224-finetuned-v2_more_Data \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_v2_more_data_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_v2_more_data_pipeline_en.md new file mode 100644 index 00000000000000..0c2a916645cb72 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_v2_more_data_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_v2_more_data_pipeline pipeline SwinForImageClassification from sai17 +author: John Snow Labs +name: cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_v2_more_data_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_v2_more_data_pipeline` is a English model originally trained by sai17. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_v2_more_data_pipeline_en_5.5.1_3.0_1734325286188.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_v2_more_data_pipeline_en_5.5.1_3.0_1734325286188.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_v2_more_data_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_v2_more_data_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cards_bottom_left_swin_tiny_patch4_window7_224_finetuned_v2_more_data_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|206.3 MB| + +## References + +https://huggingface.co/sai17/cards_bottom_left_swin-tiny-patch4-window7-224-finetuned-v2_more_Data + +## Included Models + +- ImageAssembler +- SwinForImageClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-chinese_roberta_wwm_ext_finetuned_en.md b/docs/_posts/ahmedlone127/2024-12-16-chinese_roberta_wwm_ext_finetuned_en.md new file mode 100644 index 00000000000000..7756ba50821b7b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-chinese_roberta_wwm_ext_finetuned_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English chinese_roberta_wwm_ext_finetuned BertForQuestionAnswering from DaydreamerF +author: John Snow Labs +name: chinese_roberta_wwm_ext_finetuned +date: 2024-12-16 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`chinese_roberta_wwm_ext_finetuned` is a English model originally trained by DaydreamerF. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chinese_roberta_wwm_ext_finetuned_en_5.5.1_3.0_1734339381954.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chinese_roberta_wwm_ext_finetuned_en_5.5.1_3.0_1734339381954.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = MultiDocumentAssembler() \
    .setInputCols(["question", "context"]) \
    .setOutputCols(["document_question", "document_context"])

spanClassifier = BertForQuestionAnswering.pretrained("chinese_roberta_wwm_ext_finetuned","en") \
    .setInputCols(["document_question","document_context"]) \
    .setOutputCol("answer")

pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new MultiDocumentAssembler()
    .setInputCols(Array("question", "context"))
    .setOutputCols(Array("document_question", "document_context"))

val spanClassifier = BertForQuestionAnswering.pretrained("chinese_roberta_wwm_ext_finetuned", "en")
    .setInputCols(Array("document_question","document_context"))
    .setOutputCol("answer")

val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDF("question", "context")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
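
The extracted answer span for each question/context pair ends up in the `answer` output column. A minimal way to read it back, assuming the snippet above has been run:

```python
# Show each question next to the span the model extracted from its context.
pipelineDF.select("document_question.result", "answer.result").show(truncate=False)
```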
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chinese_roberta_wwm_ext_finetuned| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|381.0 MB| + +## References + +https://huggingface.co/DaydreamerF/chinese-roberta-wwm-ext-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-cipher2_en.md b/docs/_posts/ahmedlone127/2024-12-16-cipher2_en.md new file mode 100644 index 00000000000000..e498dd5e43c3da --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-cipher2_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English cipher2 T5Transformer from suayptalha +author: John Snow Labs +name: cipher2 +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`cipher2` is a English model originally trained by suayptalha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cipher2_en_5.5.1_3.0_1734330930881.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cipher2_en_5.5.1_3.0_1734330930881.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("cipher2","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("cipher2", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
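
The generated text is written to the `output` column. Some T5 fine-tunes also expect a task prefix, which can be set with `setTask(...)` on the transformer; whether this particular checkpoint needs one is not documented here. A minimal way to read the generations, assuming the snippet above has been run:

```python
# One generated string per input document.
pipelineDF.select("output.result").show(truncate=False)
```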
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cipher2| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/suayptalha/cipher2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-cipher2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-cipher2_pipeline_en.md new file mode 100644 index 00000000000000..bb298dd5feedf0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-cipher2_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English cipher2_pipeline pipeline T5Transformer from suayptalha +author: John Snow Labs +name: cipher2_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`cipher2_pipeline` is a English model originally trained by suayptalha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cipher2_pipeline_en_5.5.1_3.0_1734330981633.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cipher2_pipeline_en_5.5.1_3.0_1734330981633.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("cipher2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("cipher2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cipher2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/suayptalha/cipher2 + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-citation_parser_entity_en.md b/docs/_posts/ahmedlone127/2024-12-16-citation_parser_entity_en.md new file mode 100644 index 00000000000000..b8bf84de4d1b4d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-citation_parser_entity_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English citation_parser_entity DistilBertForTokenClassification from SIRIS-Lab +author: John Snow Labs +name: citation_parser_entity +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`citation_parser_entity` is a English model originally trained by SIRIS-Lab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/citation_parser_entity_en_5.5.1_3.0_1734310086760.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/citation_parser_entity_en_5.5.1_3.0_1734310086760.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = DistilBertForTokenClassification.pretrained("citation_parser_entity","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = DistilBertForTokenClassification.pretrained("citation_parser_entity", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|citation_parser_entity| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.5 MB| + +## References + +https://huggingface.co/SIRIS-Lab/citation-parser-ENTITY \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-citation_parser_entity_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-citation_parser_entity_pipeline_en.md new file mode 100644 index 00000000000000..52c275db2208d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-citation_parser_entity_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English citation_parser_entity_pipeline pipeline DistilBertForTokenClassification from SIRIS-Lab +author: John Snow Labs +name: citation_parser_entity_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`citation_parser_entity_pipeline` is a English model originally trained by SIRIS-Lab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/citation_parser_entity_pipeline_en_5.5.1_3.0_1734310112101.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/citation_parser_entity_pipeline_en_5.5.1_3.0_1734310112101.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("citation_parser_entity_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("citation_parser_entity_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|citation_parser_entity_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|505.5 MB| + +## References + +https://huggingface.co/SIRIS-Lab/citation-parser-ENTITY + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-cleanedsomisetty_iucn_anpu_ssn_iban_date_imei2347_en.md b/docs/_posts/ahmedlone127/2024-12-16-cleanedsomisetty_iucn_anpu_ssn_iban_date_imei2347_en.md new file mode 100644 index 00000000000000..6b033e0d064419 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-cleanedsomisetty_iucn_anpu_ssn_iban_date_imei2347_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English cleanedsomisetty_iucn_anpu_ssn_iban_date_imei2347 DistilBertForTokenClassification from Somisetty2347 +author: John Snow Labs +name: cleanedsomisetty_iucn_anpu_ssn_iban_date_imei2347 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`cleanedsomisetty_iucn_anpu_ssn_iban_date_imei2347` is a English model originally trained by Somisetty2347. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cleanedsomisetty_iucn_anpu_ssn_iban_date_imei2347_en_5.5.1_3.0_1734310043808.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cleanedsomisetty_iucn_anpu_ssn_iban_date_imei2347_en_5.5.1_3.0_1734310043808.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = DistilBertForTokenClassification.pretrained("cleanedsomisetty_iucn_anpu_ssn_iban_date_imei2347","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = DistilBertForTokenClassification.pretrained("cleanedsomisetty_iucn_anpu_ssn_iban_date_imei2347", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cleanedsomisetty_iucn_anpu_ssn_iban_date_imei2347| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|244.0 MB| + +## References + +https://huggingface.co/Somisetty2347/CLEANEDSOMISETTY_IUCN_ANPU_SSN_IBAN_DATE_IMEI2347 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-cleanedsomisetty_iucn_anpu_ssn_iban_date_imei2347_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-cleanedsomisetty_iucn_anpu_ssn_iban_date_imei2347_pipeline_en.md new file mode 100644 index 00000000000000..a3d410c51036ab --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-cleanedsomisetty_iucn_anpu_ssn_iban_date_imei2347_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English cleanedsomisetty_iucn_anpu_ssn_iban_date_imei2347_pipeline pipeline DistilBertForTokenClassification from Somisetty2347 +author: John Snow Labs +name: cleanedsomisetty_iucn_anpu_ssn_iban_date_imei2347_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`cleanedsomisetty_iucn_anpu_ssn_iban_date_imei2347_pipeline` is a English model originally trained by Somisetty2347. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cleanedsomisetty_iucn_anpu_ssn_iban_date_imei2347_pipeline_en_5.5.1_3.0_1734310057384.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cleanedsomisetty_iucn_anpu_ssn_iban_date_imei2347_pipeline_en_5.5.1_3.0_1734310057384.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("cleanedsomisetty_iucn_anpu_ssn_iban_date_imei2347_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("cleanedsomisetty_iucn_anpu_ssn_iban_date_imei2347_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cleanedsomisetty_iucn_anpu_ssn_iban_date_imei2347_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|244.0 MB| + +## References + +https://huggingface.co/Somisetty2347/CLEANEDSOMISETTY_IUCN_ANPU_SSN_IBAN_DATE_IMEI2347 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-clinicalbert_bionlp13cg_ner_en.md b/docs/_posts/ahmedlone127/2024-12-16-clinicalbert_bionlp13cg_ner_en.md new file mode 100644 index 00000000000000..3357916f5952cd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-clinicalbert_bionlp13cg_ner_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English clinicalbert_bionlp13cg_ner DistilBertForTokenClassification from judithrosell +author: John Snow Labs +name: clinicalbert_bionlp13cg_ner +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`clinicalbert_bionlp13cg_ner` is a English model originally trained by judithrosell. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clinicalbert_bionlp13cg_ner_en_5.5.1_3.0_1734310807831.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clinicalbert_bionlp13cg_ner_en_5.5.1_3.0_1734310807831.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = DistilBertForTokenClassification.pretrained("clinicalbert_bionlp13cg_ner","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = DistilBertForTokenClassification.pretrained("clinicalbert_bionlp13cg_ner", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clinicalbert_bionlp13cg_ner| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/judithrosell/ClinicalBERT_BioNLP13CG_NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-clinicalbert_bionlp13cg_ner_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-clinicalbert_bionlp13cg_ner_pipeline_en.md new file mode 100644 index 00000000000000..b75c7399606a09 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-clinicalbert_bionlp13cg_ner_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English clinicalbert_bionlp13cg_ner_pipeline pipeline DistilBertForTokenClassification from judithrosell +author: John Snow Labs +name: clinicalbert_bionlp13cg_ner_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`clinicalbert_bionlp13cg_ner_pipeline` is a English model originally trained by judithrosell. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clinicalbert_bionlp13cg_ner_pipeline_en_5.5.1_3.0_1734310833519.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clinicalbert_bionlp13cg_ner_pipeline_en_5.5.1_3.0_1734310833519.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("clinicalbert_bionlp13cg_ner_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("clinicalbert_bionlp13cg_ner_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clinicalbert_bionlp13cg_ner_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|505.5 MB| + +## References + +https://huggingface.co/judithrosell/ClinicalBERT_BioNLP13CG_NER + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-clip_en.md b/docs/_posts/ahmedlone127/2024-12-16-clip_en.md new file mode 100644 index 00000000000000..81c7f330c6e071 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-clip_en.md @@ -0,0 +1,120 @@ +--- +layout: model +title: English clip CLIPForZeroShotClassification from yuchenxie +author: John Snow Labs +name: clip +date: 2024-12-16 +tags: [en, open_source, onnx, zero_shot, clip, image] +task: Zero-Shot Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CLIPForZeroShotClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CLIPForZeroShotClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`clip` is a English model originally trained by yuchenxie. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clip_en_5.5.1_3.0_1734315951423.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clip_en_5.5.1_3.0_1734315951423.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

imageDF = spark.read \
    .format("image") \
    .option("dropInvalid", value = True) \
    .load("src/test/resources/image/")

candidateLabels = [
    "a photo of a bird",
    "a photo of a cat",
    "a photo of a dog",
    "a photo of a hen",
    "a photo of a hippo",
    "a photo of a room",
    "a photo of a tractor",
    "a photo of an ostrich",
    "a photo of an ox"]

imageAssembler = ImageAssembler() \
    .setInputCol("image") \
    .setOutputCol("image_assembler")

imageClassifier = CLIPForZeroShotClassification.pretrained("clip","en") \
    .setInputCols(["image_assembler"]) \
    .setOutputCol("label") \
    .setCandidateLabels(candidateLabels)

pipeline = Pipeline().setStages([imageAssembler, imageClassifier])
pipelineModel = pipeline.fit(imageDF)
pipelineDF = pipelineModel.transform(imageDF)

```
```scala

val imageDF = ResourceHelper.spark.read
    .format("image")
    .option("dropInvalid", value = true)
    .load("src/test/resources/image/")

val candidateLabels = Array(
    "a photo of a bird",
    "a photo of a cat",
    "a photo of a dog",
    "a photo of a hen",
    "a photo of a hippo",
    "a photo of a room",
    "a photo of a tractor",
    "a photo of an ostrich",
    "a photo of an ox")

val imageAssembler = new ImageAssembler()
    .setInputCol("image")
    .setOutputCol("image_assembler")

val imageClassifier = CLIPForZeroShotClassification.pretrained("clip","en")
    .setInputCols(Array("image_assembler"))
    .setOutputCol("label")
    .setCandidateLabels(candidateLabels)

val pipeline = new Pipeline().setStages(Array(imageAssembler, imageClassifier))
val pipelineModel = pipeline.fit(imageDF)
val pipelineDF = pipelineModel.transform(imageDF)

```
</div>
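
The zero-shot label chosen from `candidateLabels` is stored in the `label` output column. A quick way to see which caption was picked for each image, assuming the snippet above has been run:

```python
# Pair each image path with the candidate caption the model scored highest.
pipelineDF.select("image.origin", "label.result").show(truncate=False)
```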
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clip| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[image_assembler]| +|Output Labels:|[label]| +|Language:|en| +|Size:|1.1 GB| + +## References + +https://huggingface.co/yuchenxie/CLiP \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-clip_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-clip_pipeline_en.md new file mode 100644 index 00000000000000..32ff608c87c61a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-clip_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English clip_pipeline pipeline CLIPForZeroShotClassification from yuchenxie +author: John Snow Labs +name: clip_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Zero-Shot Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained CLIPForZeroShotClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`clip_pipeline` is a English model originally trained by yuchenxie. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clip_pipeline_en_5.5.1_3.0_1734316209025.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clip_pipeline_en_5.5.1_3.0_1734316209025.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("clip_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("clip_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clip_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.1 GB| + +## References + +https://huggingface.co/yuchenxie/CLiP + +## Included Models + +- ImageAssembler +- CLIPForZeroShotClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-cnec_2_0_supertypes_slavicbert_en.md b/docs/_posts/ahmedlone127/2024-12-16-cnec_2_0_supertypes_slavicbert_en.md new file mode 100644 index 00000000000000..a0c68c1221550a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-cnec_2_0_supertypes_slavicbert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English cnec_2_0_supertypes_slavicbert BertForTokenClassification from stulcrad +author: John Snow Labs +name: cnec_2_0_supertypes_slavicbert +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, bert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`cnec_2_0_supertypes_slavicbert` is a English model originally trained by stulcrad. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cnec_2_0_supertypes_slavicbert_en_5.5.1_3.0_1734337583220.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cnec_2_0_supertypes_slavicbert_en_5.5.1_3.0_1734337583220.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = BertForTokenClassification.pretrained("cnec_2_0_supertypes_slavicbert","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = BertForTokenClassification.pretrained("cnec_2_0_supertypes_slavicbert", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cnec_2_0_supertypes_slavicbert| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|664.6 MB| + +## References + +https://huggingface.co/stulcrad/CNEC_2_0_Supertypes_slavicbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-cnec_2_0_supertypes_slavicbert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-cnec_2_0_supertypes_slavicbert_pipeline_en.md new file mode 100644 index 00000000000000..9abe9238ccf9f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-cnec_2_0_supertypes_slavicbert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English cnec_2_0_supertypes_slavicbert_pipeline pipeline BertForTokenClassification from stulcrad +author: John Snow Labs +name: cnec_2_0_supertypes_slavicbert_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`cnec_2_0_supertypes_slavicbert_pipeline` is a English model originally trained by stulcrad. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cnec_2_0_supertypes_slavicbert_pipeline_en_5.5.1_3.0_1734337616271.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cnec_2_0_supertypes_slavicbert_pipeline_en_5.5.1_3.0_1734337616271.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("cnec_2_0_supertypes_slavicbert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("cnec_2_0_supertypes_slavicbert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cnec_2_0_supertypes_slavicbert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|664.6 MB| + +## References + +https://huggingface.co/stulcrad/CNEC_2_0_Supertypes_slavicbert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-cnn_news_summary_model_trained_on_reduced_data_zuru7_en.md b/docs/_posts/ahmedlone127/2024-12-16-cnn_news_summary_model_trained_on_reduced_data_zuru7_en.md new file mode 100644 index 00000000000000..501bfb583fd365 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-cnn_news_summary_model_trained_on_reduced_data_zuru7_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English cnn_news_summary_model_trained_on_reduced_data_zuru7 T5Transformer from Zuru7 +author: John Snow Labs +name: cnn_news_summary_model_trained_on_reduced_data_zuru7 +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`cnn_news_summary_model_trained_on_reduced_data_zuru7` is a English model originally trained by Zuru7. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cnn_news_summary_model_trained_on_reduced_data_zuru7_en_5.5.1_3.0_1734329937897.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cnn_news_summary_model_trained_on_reduced_data_zuru7_en_5.5.1_3.0_1734329937897.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("cnn_news_summary_model_trained_on_reduced_data_zuru7","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("cnn_news_summary_model_trained_on_reduced_data_zuru7", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cnn_news_summary_model_trained_on_reduced_data_zuru7| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|244.1 MB| + +## References + +https://huggingface.co/Zuru7/cnn_news_summary_model_trained_on_reduced_data \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-cnn_news_summary_model_trained_on_reduced_data_zuru7_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-cnn_news_summary_model_trained_on_reduced_data_zuru7_pipeline_en.md new file mode 100644 index 00000000000000..dde41e896bc3b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-cnn_news_summary_model_trained_on_reduced_data_zuru7_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English cnn_news_summary_model_trained_on_reduced_data_zuru7_pipeline pipeline T5Transformer from Zuru7 +author: John Snow Labs +name: cnn_news_summary_model_trained_on_reduced_data_zuru7_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`cnn_news_summary_model_trained_on_reduced_data_zuru7_pipeline` is a English model originally trained by Zuru7. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cnn_news_summary_model_trained_on_reduced_data_zuru7_pipeline_en_5.5.1_3.0_1734329972264.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cnn_news_summary_model_trained_on_reduced_data_zuru7_pipeline_en_5.5.1_3.0_1734329972264.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("cnn_news_summary_model_trained_on_reduced_data_zuru7_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("cnn_news_summary_model_trained_on_reduced_data_zuru7_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cnn_news_summary_model_trained_on_reduced_data_zuru7_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|244.1 MB| + +## References + +https://huggingface.co/Zuru7/cnn_news_summary_model_trained_on_reduced_data + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-cs505_nercoqe_xlm_object_en.md b/docs/_posts/ahmedlone127/2024-12-16-cs505_nercoqe_xlm_object_en.md new file mode 100644 index 00000000000000..538d3af93d6082 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-cs505_nercoqe_xlm_object_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English cs505_nercoqe_xlm_object XlmRoBertaForTokenClassification from ThuyNT03 +author: John Snow Labs +name: cs505_nercoqe_xlm_object +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`cs505_nercoqe_xlm_object` is a English model originally trained by ThuyNT03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cs505_nercoqe_xlm_object_en_5.5.1_3.0_1734321492850.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cs505_nercoqe_xlm_object_en_5.5.1_3.0_1734321492850.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = XlmRoBertaForTokenClassification.pretrained("cs505_nercoqe_xlm_object","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("cs505_nercoqe_xlm_object", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cs505_nercoqe_xlm_object| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|770.2 MB| + +## References + +https://huggingface.co/ThuyNT03/CS505-NerCOQE-xlm-Object \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-cs505_nercoqe_xlm_object_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-cs505_nercoqe_xlm_object_pipeline_en.md new file mode 100644 index 00000000000000..4586ce095cff8c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-cs505_nercoqe_xlm_object_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English cs505_nercoqe_xlm_object_pipeline pipeline XlmRoBertaForTokenClassification from ThuyNT03 +author: John Snow Labs +name: cs505_nercoqe_xlm_object_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`cs505_nercoqe_xlm_object_pipeline` is a English model originally trained by ThuyNT03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cs505_nercoqe_xlm_object_pipeline_en_5.5.1_3.0_1734321633942.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cs505_nercoqe_xlm_object_pipeline_en_5.5.1_3.0_1734321633942.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("cs505_nercoqe_xlm_object_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("cs505_nercoqe_xlm_object_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cs505_nercoqe_xlm_object_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|770.2 MB| + +## References + +https://huggingface.co/ThuyNT03/CS505-NerCOQE-xlm-Object + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-datadreamer_dev_abstracts_tonga_tonga_islands_tweet_model_en.md b/docs/_posts/ahmedlone127/2024-12-16-datadreamer_dev_abstracts_tonga_tonga_islands_tweet_model_en.md new file mode 100644 index 00000000000000..1a515307eea5f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-datadreamer_dev_abstracts_tonga_tonga_islands_tweet_model_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English datadreamer_dev_abstracts_tonga_tonga_islands_tweet_model T5Transformer from dansul +author: John Snow Labs +name: datadreamer_dev_abstracts_tonga_tonga_islands_tweet_model +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`datadreamer_dev_abstracts_tonga_tonga_islands_tweet_model` is a English model originally trained by dansul. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/datadreamer_dev_abstracts_tonga_tonga_islands_tweet_model_en_5.5.1_3.0_1734327292223.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/datadreamer_dev_abstracts_tonga_tonga_islands_tweet_model_en_5.5.1_3.0_1734327292223.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("datadreamer_dev_abstracts_tonga_tonga_islands_tweet_model","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("datadreamer_dev_abstracts_tonga_tonga_islands_tweet_model", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
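
The generated text is written to the `output` column. A hedged sketch of reading it back, together with an optional length limit on generation (the value is illustrative; the prompt format this particular fine-tune expects is not documented in this card):

```python
# Optional generation setting -- illustrative value only
t5 = T5Transformer.pretrained("datadreamer_dev_abstracts_tonga_tonga_islands_tweet_model", "en") \
    .setInputCols(["document"]) \
    .setOutputCol("output") \
    .setMaxOutputLength(128)

# After transform(), the generated text is in output.result
pipelineDF.select("output.result").show(truncate=False)
```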
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|datadreamer_dev_abstracts_tonga_tonga_islands_tweet_model| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|884.3 MB| + +## References + +https://huggingface.co/dansul/datadreamer-dev-abstracts_to_tweet_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-datadreamer_dev_abstracts_tonga_tonga_islands_tweet_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-datadreamer_dev_abstracts_tonga_tonga_islands_tweet_model_pipeline_en.md new file mode 100644 index 00000000000000..50dda5400dd15a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-datadreamer_dev_abstracts_tonga_tonga_islands_tweet_model_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English datadreamer_dev_abstracts_tonga_tonga_islands_tweet_model_pipeline pipeline T5Transformer from dansul +author: John Snow Labs +name: datadreamer_dev_abstracts_tonga_tonga_islands_tweet_model_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`datadreamer_dev_abstracts_tonga_tonga_islands_tweet_model_pipeline` is a English model originally trained by dansul. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/datadreamer_dev_abstracts_tonga_tonga_islands_tweet_model_pipeline_en_5.5.1_3.0_1734327383113.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/datadreamer_dev_abstracts_tonga_tonga_islands_tweet_model_pipeline_en_5.5.1_3.0_1734327383113.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("datadreamer_dev_abstracts_tonga_tonga_islands_tweet_model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("datadreamer_dev_abstracts_tonga_tonga_islands_tweet_model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|datadreamer_dev_abstracts_tonga_tonga_islands_tweet_model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|884.3 MB| + +## References + +https://huggingface.co/dansul/datadreamer-dev-abstracts_to_tweet_model + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-datasnipper_finerdistilbert_en.md b/docs/_posts/ahmedlone127/2024-12-16-datasnipper_finerdistilbert_en.md new file mode 100644 index 00000000000000..ab6ccaa29dc6b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-datasnipper_finerdistilbert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English datasnipper_finerdistilbert DistilBertForTokenClassification from gvisser +author: John Snow Labs +name: datasnipper_finerdistilbert +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`datasnipper_finerdistilbert` is a English model originally trained by gvisser. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/datasnipper_finerdistilbert_en_5.5.1_3.0_1734310599367.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/datasnipper_finerdistilbert_en_5.5.1_3.0_1734310599367.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = DistilBertForTokenClassification.pretrained("datasnipper_finerdistilbert","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = DistilBertForTokenClassification.pretrained("datasnipper_finerdistilbert", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
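
To turn the token-level B-/I- tags into whole entity chunks, a `NerConverter` stage can be appended after the classifier; a minimal sketch assuming the column names used above:

```python
from sparknlp.annotator import NerConverter

# Groups consecutive B-/I- tags into complete entity spans
converter = NerConverter() \
    .setInputCols(["document", "token", "ner"]) \
    .setOutputCol("ner_chunk")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier, converter])
result = pipeline.fit(data).transform(data)
result.select("ner_chunk.result").show(truncate=False)
```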
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|datasnipper_finerdistilbert| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|248.1 MB| + +## References + +https://huggingface.co/gvisser/DataSnipper_FinerDistilBert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-datasnipper_finerdistilbert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-datasnipper_finerdistilbert_pipeline_en.md new file mode 100644 index 00000000000000..8e5af24783f6ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-datasnipper_finerdistilbert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English datasnipper_finerdistilbert_pipeline pipeline DistilBertForTokenClassification from gvisser +author: John Snow Labs +name: datasnipper_finerdistilbert_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`datasnipper_finerdistilbert_pipeline` is a English model originally trained by gvisser. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/datasnipper_finerdistilbert_pipeline_en_5.5.1_3.0_1734310613041.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/datasnipper_finerdistilbert_pipeline_en_5.5.1_3.0_1734310613041.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("datasnipper_finerdistilbert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("datasnipper_finerdistilbert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|datasnipper_finerdistilbert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|248.1 MB| + +## References + +https://huggingface.co/gvisser/DataSnipper_FinerDistilBert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_2nd_dec_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_2nd_dec_en.md new file mode 100644 index 00000000000000..d9f4612268db0a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_2nd_dec_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English deberta_2nd_dec DeBertaForSequenceClassification from ajinkya-ftpl +author: John Snow Labs +name: deberta_2nd_dec +date: 2024-12-16 +tags: [en, open_source, onnx, sequence_classification, deberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DeBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_2nd_dec` is a English model originally trained by ajinkya-ftpl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_2nd_dec_en_5.5.1_3.0_1734312083415.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_2nd_dec_en_5.5.1_3.0_1734312083415.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_2nd_dec","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_2nd_dec", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
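
The predicted label for each input row is written to the `class` column; the label set itself is defined by the fine-tuned model and is not documented in this card. A minimal sketch of reading the prediction and the per-label scores:

```python
# Predicted label for each row
pipelineDF.select("class.result").show(truncate=False)

# Per-label confidence scores are carried in the annotation metadata
pipelineDF.select("class.metadata").show(truncate=False)
```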
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_2nd_dec| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.5 GB| + +## References + +https://huggingface.co/ajinkya-ftpl/deberta_2nd_dec \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_2nd_dec_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_2nd_dec_pipeline_en.md new file mode 100644 index 00000000000000..5ce117e22a5853 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_2nd_dec_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English deberta_2nd_dec_pipeline pipeline DeBertaForSequenceClassification from ajinkya-ftpl +author: John Snow Labs +name: deberta_2nd_dec_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_2nd_dec_pipeline` is a English model originally trained by ajinkya-ftpl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_2nd_dec_pipeline_en_5.5.1_3.0_1734312219921.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_2nd_dec_pipeline_en_5.5.1_3.0_1734312219921.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("deberta_2nd_dec_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("deberta_2nd_dec_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_2nd_dec_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.5 GB| + +## References + +https://huggingface.co/ajinkya-ftpl/deberta_2nd_dec + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DeBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_large_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_large_en.md new file mode 100644 index 00000000000000..71c37de972f067 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_large_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English deberta_large DeBertaForSequenceClassification from jhonalevc1995 +author: John Snow Labs +name: deberta_large +date: 2024-12-16 +tags: [en, open_source, onnx, sequence_classification, deberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DeBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_large` is a English model originally trained by jhonalevc1995. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_large_en_5.5.1_3.0_1734311811239.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_large_en_5.5.1_3.0_1734311811239.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_large","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_large", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
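
For ad-hoc scoring of individual strings it is usually cheaper to wrap the fitted model in a `LightPipeline` than to round-trip through a DataFrame; a minimal sketch reusing `pipelineModel` from above:

```python
from sparknlp.base import LightPipeline

light = LightPipeline(pipelineModel)

# Returns a plain dict keyed by the pipeline's output columns
result = light.annotate("I love spark-nlp")
print(result["class"])
```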
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_large| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.5 GB| + +## References + +https://huggingface.co/jhonalevc1995/deberta_large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_large_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_large_pipeline_en.md new file mode 100644 index 00000000000000..9513a98a150778 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_large_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English deberta_large_pipeline pipeline DeBertaForSequenceClassification from jhonalevc1995 +author: John Snow Labs +name: deberta_large_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_large_pipeline` is a English model originally trained by jhonalevc1995. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_large_pipeline_en_5.5.1_3.0_1734311938051.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_large_pipeline_en_5.5.1_3.0_1734311938051.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("deberta_large_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("deberta_large_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_large_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.5 GB| + +## References + +https://huggingface.co/jhonalevc1995/deberta_large + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DeBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_reward_model_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_reward_model_en.md new file mode 100644 index 00000000000000..93099c66ee214d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_reward_model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English deberta_v3_base_reward_model DeBertaForSequenceClassification from JustDoItNow +author: John Snow Labs +name: deberta_v3_base_reward_model +date: 2024-12-16 +tags: [en, open_source, onnx, sequence_classification, deberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DeBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_v3_base_reward_model` is a English model originally trained by JustDoItNow. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_v3_base_reward_model_en_5.5.1_3.0_1734312226032.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_v3_base_reward_model_en_5.5.1_3.0_1734312226032.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_base_reward_model","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_base_reward_model", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
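
Calling `pretrained()` downloads the model on first use, so it is often worth persisting the fitted `PipelineModel` once and reloading it in later jobs; a sketch with an illustrative path:

```python
from pyspark.ml import PipelineModel

# Illustrative path -- any local, HDFS, or S3 location works
model_path = "/tmp/deberta_v3_base_reward_model_pipeline"
pipelineModel.write().overwrite().save(model_path)

# Later, or in another application
restored = PipelineModel.load(model_path)
restored.transform(data).select("class.result").show(truncate=False)
```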
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_v3_base_reward_model| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|557.2 MB| + +## References + +https://huggingface.co/JustDoItNow/deberta-v3-base-reward-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_reward_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_reward_model_pipeline_en.md new file mode 100644 index 00000000000000..3bc1586efce2b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_reward_model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English deberta_v3_base_reward_model_pipeline pipeline DeBertaForSequenceClassification from JustDoItNow +author: John Snow Labs +name: deberta_v3_base_reward_model_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_v3_base_reward_model_pipeline` is a English model originally trained by JustDoItNow. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_v3_base_reward_model_pipeline_en_5.5.1_3.0_1734312312963.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_v3_base_reward_model_pipeline_en_5.5.1_3.0_1734312312963.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("deberta_v3_base_reward_model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("deberta_v3_base_reward_model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_v3_base_reward_model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|557.2 MB| + +## References + +https://huggingface.co/JustDoItNow/deberta-v3-base-reward-model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DeBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_tuned_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_tuned_en.md new file mode 100644 index 00000000000000..02a492b1bc61f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_tuned_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English deberta_v3_base_tuned DeBertaForSequenceClassification from gnurt2041 +author: John Snow Labs +name: deberta_v3_base_tuned +date: 2024-12-16 +tags: [en, open_source, onnx, sequence_classification, deberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DeBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_v3_base_tuned` is a English model originally trained by gnurt2041. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_v3_base_tuned_en_5.5.1_3.0_1734311514348.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_v3_base_tuned_en_5.5.1_3.0_1734311514348.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_base_tuned","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_base_tuned", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
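
On larger datasets, throughput and memory use are mostly governed by the classifier's batching and truncation settings; the values below are illustrative starting points to tune, not recommendations from the model author:

```python
sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_base_tuned", "en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class") \
    .setBatchSize(8) \
    .setMaxSentenceLength(256)  # longer inputs are truncated
```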
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_v3_base_tuned| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|585.6 MB| + +## References + +https://huggingface.co/gnurt2041/deberta-v3-base-tuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_tuned_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_tuned_pipeline_en.md new file mode 100644 index 00000000000000..75ecc61e50443e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_tuned_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English deberta_v3_base_tuned_pipeline pipeline DeBertaForSequenceClassification from gnurt2041 +author: John Snow Labs +name: deberta_v3_base_tuned_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_v3_base_tuned_pipeline` is a English model originally trained by gnurt2041. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_v3_base_tuned_pipeline_en_5.5.1_3.0_1734311574804.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_v3_base_tuned_pipeline_en_5.5.1_3.0_1734311574804.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("deberta_v3_base_tuned_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("deberta_v3_base_tuned_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_v3_base_tuned_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|585.6 MB| + +## References + +https://huggingface.co/gnurt2041/deberta-v3-base-tuned + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DeBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_zyda_2_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_zyda_2_en.md new file mode 100644 index 00000000000000..4031ec2fff3bf3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_zyda_2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English deberta_v3_base_zyda_2 DeBertaForSequenceClassification from agentlans +author: John Snow Labs +name: deberta_v3_base_zyda_2 +date: 2024-12-16 +tags: [en, open_source, onnx, sequence_classification, deberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DeBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_v3_base_zyda_2` is a English model originally trained by agentlans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_v3_base_zyda_2_en_5.5.1_3.0_1734312631767.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_v3_base_zyda_2_en_5.5.1_3.0_1734312631767.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_base_zyda_2","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_base_zyda_2", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
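
The snippets above assume an already running Spark session with Spark NLP on the classpath and the usual imports in scope; a minimal sketch of that setup:

```python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, DeBertaForSequenceClassification
from pyspark.ml import Pipeline

# Starts (or returns) a SparkSession configured for Spark NLP
spark = sparknlp.start()
```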
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_v3_base_zyda_2| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|692.1 MB| + +## References + +https://huggingface.co/agentlans/deberta-v3-base-zyda-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_zyda_2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_zyda_2_pipeline_en.md new file mode 100644 index 00000000000000..2fc6a69b59527a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_zyda_2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English deberta_v3_base_zyda_2_pipeline pipeline DeBertaForSequenceClassification from agentlans +author: John Snow Labs +name: deberta_v3_base_zyda_2_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_v3_base_zyda_2_pipeline` is a English model originally trained by agentlans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_v3_base_zyda_2_pipeline_en_5.5.1_3.0_1734312669615.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_v3_base_zyda_2_pipeline_en_5.5.1_3.0_1734312669615.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("deberta_v3_base_zyda_2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("deberta_v3_base_zyda_2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_v3_base_zyda_2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|692.1 MB| + +## References + +https://huggingface.co/agentlans/deberta-v3-base-zyda-2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DeBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_zyda_2_quality_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_zyda_2_quality_en.md new file mode 100644 index 00000000000000..944cf3aad4ec62 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_zyda_2_quality_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English deberta_v3_base_zyda_2_quality DeBertaForSequenceClassification from agentlans +author: John Snow Labs +name: deberta_v3_base_zyda_2_quality +date: 2024-12-16 +tags: [en, open_source, onnx, sequence_classification, deberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DeBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_v3_base_zyda_2_quality` is a English model originally trained by agentlans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_v3_base_zyda_2_quality_en_5.5.1_3.0_1734312155032.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_v3_base_zyda_2_quality_en_5.5.1_3.0_1734312155032.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_base_zyda_2_quality","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_base_zyda_2_quality", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_v3_base_zyda_2_quality| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|692.1 MB| + +## References + +https://huggingface.co/agentlans/deberta-v3-base-zyda-2-quality \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_zyda_2_quality_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_zyda_2_quality_pipeline_en.md new file mode 100644 index 00000000000000..abf9b8a026086d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_zyda_2_quality_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English deberta_v3_base_zyda_2_quality_pipeline pipeline DeBertaForSequenceClassification from agentlans +author: John Snow Labs +name: deberta_v3_base_zyda_2_quality_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_v3_base_zyda_2_quality_pipeline` is a English model originally trained by agentlans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_v3_base_zyda_2_quality_pipeline_en_5.5.1_3.0_1734312189895.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_v3_base_zyda_2_quality_pipeline_en_5.5.1_3.0_1734312189895.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("deberta_v3_base_zyda_2_quality_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("deberta_v3_base_zyda_2_quality_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_v3_base_zyda_2_quality_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|692.1 MB| + +## References + +https://huggingface.co/agentlans/deberta-v3-base-zyda-2-quality + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DeBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_zyda_2_sentiment_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_zyda_2_sentiment_en.md new file mode 100644 index 00000000000000..20d434635ed1df --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_zyda_2_sentiment_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English deberta_v3_base_zyda_2_sentiment DeBertaForSequenceClassification from agentlans +author: John Snow Labs +name: deberta_v3_base_zyda_2_sentiment +date: 2024-12-16 +tags: [en, open_source, onnx, sequence_classification, deberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DeBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_v3_base_zyda_2_sentiment` is a English model originally trained by agentlans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_v3_base_zyda_2_sentiment_en_5.5.1_3.0_1734311810527.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_v3_base_zyda_2_sentiment_en_5.5.1_3.0_1734311810527.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_base_zyda_2_sentiment","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_base_zyda_2_sentiment", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_v3_base_zyda_2_sentiment| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|692.1 MB| + +## References + +https://huggingface.co/agentlans/deberta-v3-base-zyda-2-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_zyda_2_sentiment_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_zyda_2_sentiment_pipeline_en.md new file mode 100644 index 00000000000000..f280f69d47f267 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_base_zyda_2_sentiment_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English deberta_v3_base_zyda_2_sentiment_pipeline pipeline DeBertaForSequenceClassification from agentlans +author: John Snow Labs +name: deberta_v3_base_zyda_2_sentiment_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_v3_base_zyda_2_sentiment_pipeline` is a English model originally trained by agentlans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_v3_base_zyda_2_sentiment_pipeline_en_5.5.1_3.0_1734311852097.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_v3_base_zyda_2_sentiment_pipeline_en_5.5.1_3.0_1734311852097.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("deberta_v3_base_zyda_2_sentiment_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("deberta_v3_base_zyda_2_sentiment_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_v3_base_zyda_2_sentiment_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|692.1 MB| + +## References + +https://huggingface.co/agentlans/deberta-v3-base-zyda-2-sentiment + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DeBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_large_survey_fluency_rater_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_large_survey_fluency_rater_en.md new file mode 100644 index 00000000000000..f22333ff758710 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_large_survey_fluency_rater_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English deberta_v3_large_survey_fluency_rater DeBertaForSequenceClassification from domenicrosati +author: John Snow Labs +name: deberta_v3_large_survey_fluency_rater +date: 2024-12-16 +tags: [en, open_source, onnx, sequence_classification, deberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DeBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_v3_large_survey_fluency_rater` is a English model originally trained by domenicrosati. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_v3_large_survey_fluency_rater_en_5.5.1_3.0_1734312790490.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_v3_large_survey_fluency_rater_en_5.5.1_3.0_1734312790490.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_large_survey_fluency_rater","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_large_survey_fluency_rater", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_v3_large_survey_fluency_rater| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.5 GB| + +## References + +https://huggingface.co/domenicrosati/deberta-v3-large-survey-fluency-rater \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_large_survey_fluency_rater_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_large_survey_fluency_rater_pipeline_en.md new file mode 100644 index 00000000000000..618ce21ff82e71 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_large_survey_fluency_rater_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English deberta_v3_large_survey_fluency_rater_pipeline pipeline DeBertaForSequenceClassification from domenicrosati +author: John Snow Labs +name: deberta_v3_large_survey_fluency_rater_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_v3_large_survey_fluency_rater_pipeline` is a English model originally trained by domenicrosati. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_v3_large_survey_fluency_rater_pipeline_en_5.5.1_3.0_1734312910652.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_v3_large_survey_fluency_rater_pipeline_en_5.5.1_3.0_1734312910652.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("deberta_v3_large_survey_fluency_rater_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("deberta_v3_large_survey_fluency_rater_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
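+
+The snippet above assumes a DataFrame `df` with a `text` column already exists. As a minimal sketch of end-to-end usage (the sample sentence is illustrative, and the predictions are assumed to land in the classifier's default `class` column):
+
+```python
+import sparknlp
+from sparknlp.pretrained import PretrainedPipeline
+
+spark = sparknlp.start()  # assumes Spark NLP is installed
+
+# illustrative input; any DataFrame with a "text" column works
+df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+
+pipeline = PretrainedPipeline("deberta_v3_large_survey_fluency_rater_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+# predicted labels (assumed "class" output column)
+annotations.select("text", "class.result").show(truncate = False)
+```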
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_v3_large_survey_fluency_rater_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.5 GB| + +## References + +https://huggingface.co/domenicrosati/deberta-v3-large-survey-fluency-rater + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DeBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_quality_v2_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_quality_v2_en.md new file mode 100644 index 00000000000000..43547d8eeef483 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_quality_v2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English deberta_v3_xsmall_quality_v2 DeBertaForSequenceClassification from agentlans +author: John Snow Labs +name: deberta_v3_xsmall_quality_v2 +date: 2024-12-16 +tags: [en, open_source, onnx, sequence_classification, deberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DeBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_v3_xsmall_quality_v2` is a English model originally trained by agentlans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_v3_xsmall_quality_v2_en_5.5.1_3.0_1734311457295.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_v3_xsmall_quality_v2_en_5.5.1_3.0_1734311457295.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_xsmall_quality_v2","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_xsmall_quality_v2", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_v3_xsmall_quality_v2| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|243.9 MB| + +## References + +https://huggingface.co/agentlans/deberta-v3-xsmall-quality-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_quality_v2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_quality_v2_pipeline_en.md new file mode 100644 index 00000000000000..6c2d7dca62bc09 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_quality_v2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English deberta_v3_xsmall_quality_v2_pipeline pipeline DeBertaForSequenceClassification from agentlans +author: John Snow Labs +name: deberta_v3_xsmall_quality_v2_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_v3_xsmall_quality_v2_pipeline` is a English model originally trained by agentlans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_v3_xsmall_quality_v2_pipeline_en_5.5.1_3.0_1734311476864.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_v3_xsmall_quality_v2_pipeline_en_5.5.1_3.0_1734311476864.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("deberta_v3_xsmall_quality_v2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("deberta_v3_xsmall_quality_v2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
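+
+As above, `df` is assumed to be a DataFrame with a `text` column. A short, hedged sketch of building that input and reading back the predicted quality labels (assumed `class` output column):
+
+```python
+import sparknlp
+from sparknlp.pretrained import PretrainedPipeline
+
+spark = sparknlp.start()  # assumes Spark NLP is installed
+
+df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")  # illustrative input
+
+pipeline = PretrainedPipeline("deberta_v3_xsmall_quality_v2_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+annotations.select("text", "class.result").show(truncate = False)
+```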
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_v3_xsmall_quality_v2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|244.0 MB| + +## References + +https://huggingface.co/agentlans/deberta-v3-xsmall-quality-v2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DeBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_readability_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_readability_en.md new file mode 100644 index 00000000000000..130674edaf4adb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_readability_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English deberta_v3_xsmall_readability DeBertaForSequenceClassification from agentlans +author: John Snow Labs +name: deberta_v3_xsmall_readability +date: 2024-12-16 +tags: [en, open_source, onnx, sequence_classification, deberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DeBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_v3_xsmall_readability` is a English model originally trained by agentlans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_v3_xsmall_readability_en_5.5.1_3.0_1734311451362.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_v3_xsmall_readability_en_5.5.1_3.0_1734311451362.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_xsmall_readability","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_xsmall_readability", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_v3_xsmall_readability| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|266.1 MB| + +## References + +https://huggingface.co/agentlans/deberta-v3-xsmall-readability \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_readability_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_readability_pipeline_en.md new file mode 100644 index 00000000000000..d6eda53ebc38cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_readability_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English deberta_v3_xsmall_readability_pipeline pipeline DeBertaForSequenceClassification from agentlans +author: John Snow Labs +name: deberta_v3_xsmall_readability_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_v3_xsmall_readability_pipeline` is a English model originally trained by agentlans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_v3_xsmall_readability_pipeline_en_5.5.1_3.0_1734311465609.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_v3_xsmall_readability_pipeline_en_5.5.1_3.0_1734311465609.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("deberta_v3_xsmall_readability_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("deberta_v3_xsmall_readability_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
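+
+The usage above assumes `df` already exists. A minimal sketch, assuming a running Spark NLP session and that the readability prediction is written to the classifier's default `class` column:
+
+```python
+import sparknlp
+from sparknlp.pretrained import PretrainedPipeline
+
+spark = sparknlp.start()  # assumes Spark NLP is installed
+
+df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")  # illustrative input
+
+pipeline = PretrainedPipeline("deberta_v3_xsmall_readability_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+annotations.select("text", "class.result").show(truncate = False)
+```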
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_v3_xsmall_readability_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|266.2 MB| + +## References + +https://huggingface.co/agentlans/deberta-v3-xsmall-readability + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DeBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_readability_v2_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_readability_v2_en.md new file mode 100644 index 00000000000000..52eccc556d2c27 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_readability_v2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English deberta_v3_xsmall_readability_v2 DeBertaForSequenceClassification from agentlans +author: John Snow Labs +name: deberta_v3_xsmall_readability_v2 +date: 2024-12-16 +tags: [en, open_source, onnx, sequence_classification, deberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DeBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_v3_xsmall_readability_v2` is a English model originally trained by agentlans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_v3_xsmall_readability_v2_en_5.5.1_3.0_1734313470151.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_v3_xsmall_readability_v2_en_5.5.1_3.0_1734313470151.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_xsmall_readability_v2","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_xsmall_readability_v2", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_v3_xsmall_readability_v2| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|259.6 MB| + +## References + +https://huggingface.co/agentlans/deberta-v3-xsmall-readability-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_readability_v2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_readability_v2_pipeline_en.md new file mode 100644 index 00000000000000..67ce3af04e9465 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_readability_v2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English deberta_v3_xsmall_readability_v2_pipeline pipeline DeBertaForSequenceClassification from agentlans +author: John Snow Labs +name: deberta_v3_xsmall_readability_v2_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_v3_xsmall_readability_v2_pipeline` is a English model originally trained by agentlans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_v3_xsmall_readability_v2_pipeline_en_5.5.1_3.0_1734313485355.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_v3_xsmall_readability_v2_pipeline_en_5.5.1_3.0_1734313485355.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("deberta_v3_xsmall_readability_v2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("deberta_v3_xsmall_readability_v2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
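+
+The snippet above assumes an existing `df`. One hedged way to prepare the input and inspect the output (sample text is illustrative; `class` is the assumed output column):
+
+```python
+import sparknlp
+from sparknlp.pretrained import PretrainedPipeline
+
+spark = sparknlp.start()  # assumes Spark NLP is installed
+
+df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")  # illustrative input
+
+pipeline = PretrainedPipeline("deberta_v3_xsmall_readability_v2_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+annotations.select("text", "class.result").show(truncate = False)
+```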
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_v3_xsmall_readability_v2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|259.6 MB| + +## References + +https://huggingface.co/agentlans/deberta-v3-xsmall-readability-v2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DeBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_zyda_2_quality_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_zyda_2_quality_en.md new file mode 100644 index 00000000000000..11307de702d23b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_zyda_2_quality_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English deberta_v3_xsmall_zyda_2_quality DeBertaForSequenceClassification from agentlans +author: John Snow Labs +name: deberta_v3_xsmall_zyda_2_quality +date: 2024-12-16 +tags: [en, open_source, onnx, sequence_classification, deberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DeBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_v3_xsmall_zyda_2_quality` is a English model originally trained by agentlans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_v3_xsmall_zyda_2_quality_en_5.5.1_3.0_1734313489902.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_v3_xsmall_zyda_2_quality_en_5.5.1_3.0_1734313489902.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_xsmall_zyda_2_quality","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("document"))
+    .setOutputCol("token")
+
+val sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_xsmall_zyda_2_quality", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_v3_xsmall_zyda_2_quality| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|266.2 MB| + +## References + +https://huggingface.co/agentlans/deberta-v3-xsmall-zyda-2-quality \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_zyda_2_quality_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_zyda_2_quality_pipeline_en.md new file mode 100644 index 00000000000000..afe88ba0ecaf37 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-deberta_v3_xsmall_zyda_2_quality_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English deberta_v3_xsmall_zyda_2_quality_pipeline pipeline DeBertaForSequenceClassification from agentlans +author: John Snow Labs +name: deberta_v3_xsmall_zyda_2_quality_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`deberta_v3_xsmall_zyda_2_quality_pipeline` is a English model originally trained by agentlans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_v3_xsmall_zyda_2_quality_pipeline_en_5.5.1_3.0_1734313503735.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_v3_xsmall_zyda_2_quality_pipeline_en_5.5.1_3.0_1734313503735.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("deberta_v3_xsmall_zyda_2_quality_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("deberta_v3_xsmall_zyda_2_quality_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
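+
+As in the other pipeline cards, `df` is assumed to exist. A minimal sketch of end-to-end usage (illustrative input; `class` is the assumed output column):
+
+```python
+import sparknlp
+from sparknlp.pretrained import PretrainedPipeline
+
+spark = sparknlp.start()  # assumes Spark NLP is installed
+
+df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")  # illustrative input
+
+pipeline = PretrainedPipeline("deberta_v3_xsmall_zyda_2_quality_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+annotations.select("text", "class.result").show(truncate = False)
+```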
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deberta_v3_xsmall_zyda_2_quality_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|266.2 MB| + +## References + +https://huggingface.co/agentlans/deberta-v3-xsmall-zyda-2-quality + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DeBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-delivery_truck_classification_en.md b/docs/_posts/ahmedlone127/2024-12-16-delivery_truck_classification_en.md new file mode 100644 index 00000000000000..49029d500818da --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-delivery_truck_classification_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English delivery_truck_classification SwinForImageClassification from JEdward7777 +author: John Snow Labs +name: delivery_truck_classification +date: 2024-12-16 +tags: [en, open_source, onnx, image_classification, swin] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: SwinForImageClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`delivery_truck_classification` is a English model originally trained by JEdward7777. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/delivery_truck_classification_en_5.5.1_3.0_1734325007985.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/delivery_truck_classification_en_5.5.1_3.0_1734325007985.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+image_assembler = ImageAssembler()\
+  .setInputCol("image")\
+  .setOutputCol("image_assembler")
+
+imageClassifier = SwinForImageClassification.pretrained("delivery_truck_classification","en")\
+  .setInputCols("image_assembler")\
+  .setOutputCol("class")
+
+pipeline = Pipeline(stages=[
+  image_assembler,
+  imageClassifier,
+])
+
+pipelineModel = pipeline.fit(imageDF)
+
+pipelineDF = pipelineModel.transform(imageDF)
+
+```
+```scala
+
+val imageAssembler = new ImageAssembler()
+  .setInputCol("image")
+  .setOutputCol("image_assembler")
+
+val imageClassifier = SwinForImageClassification.pretrained("delivery_truck_classification","en")
+  .setInputCols("image_assembler")
+  .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(imageAssembler, imageClassifier))
+
+val pipelineModel = pipeline.fit(imageDF)
+
+val pipelineDF = pipelineModel.transform(imageDF)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|delivery_truck_classification| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[image_assembler]| +|Output Labels:|[label]| +|Language:|en| +|Size:|206.3 MB| + +## References + +https://huggingface.co/JEdward7777/delivery_truck_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-delivery_truck_classification_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-delivery_truck_classification_pipeline_en.md new file mode 100644 index 00000000000000..f5cd1e58e45c39 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-delivery_truck_classification_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English delivery_truck_classification_pipeline pipeline SwinForImageClassification from JEdward7777 +author: John Snow Labs +name: delivery_truck_classification_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`delivery_truck_classification_pipeline` is a English model originally trained by JEdward7777. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/delivery_truck_classification_pipeline_en_5.5.1_3.0_1734325018756.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/delivery_truck_classification_pipeline_en_5.5.1_3.0_1734325018756.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("delivery_truck_classification_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("delivery_truck_classification_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
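+
+This pipeline expects image input rather than text. A minimal sketch, assuming a local folder of images read through Spark's built-in image data source (the path and the `class` output column are illustrative assumptions):
+
+```python
+import sparknlp
+from sparknlp.pretrained import PretrainedPipeline
+
+spark = sparknlp.start()  # assumes Spark NLP is installed
+
+# illustrative: load images with Spark's image data source
+image_df = spark.read.format("image").option("dropInvalid", True).load("path/to/images")
+
+pipeline = PretrainedPipeline("delivery_truck_classification_pipeline", lang = "en")
+predictions = pipeline.transform(image_df)
+
+predictions.select("class.result").show(truncate = False)
+```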
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|delivery_truck_classification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|206.3 MB| + +## References + +https://huggingface.co/JEdward7777/delivery_truck_classification + +## Included Models + +- ImageAssembler +- SwinForImageClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-demotest_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-demotest_pipeline_en.md new file mode 100644 index 00000000000000..140ba5b141dcb9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-demotest_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English demotest_pipeline pipeline VisionEncoderDecoderForImageCaptioning from Tomatolovve +author: John Snow Labs +name: demotest_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Captioning +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained VisionEncoderDecoderForImageCaptioning, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`demotest_pipeline` is a English model originally trained by Tomatolovve. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/demotest_pipeline_en_5.5.1_3.0_1734318842103.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/demotest_pipeline_en_5.5.1_3.0_1734318842103.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("demotest_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("demotest_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
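+
+This captioning pipeline also takes image input. A hedged sketch of feeding it images (the path is illustrative, and the exact name of the caption output column depends on the pipeline, so the full output is shown):
+
+```python
+import sparknlp
+from sparknlp.pretrained import PretrainedPipeline
+
+spark = sparknlp.start()  # assumes Spark NLP is installed
+
+# illustrative: load images with Spark's image data source
+image_df = spark.read.format("image").option("dropInvalid", True).load("path/to/images")
+
+pipeline = PretrainedPipeline("demotest_pipeline", lang = "en")
+captions = pipeline.transform(image_df)
+
+# generated captions appear in the pipeline's output column; inspect the schema if unsure
+captions.show(truncate = False)
+```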
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|demotest_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/Tomatolovve/DemoTest + +## Included Models + +- ImageAssembler +- VisionEncoderDecoderForImageCaptioning \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distil_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2024-12-16-distil_bert_finetuned_ner_en.md new file mode 100644 index 00000000000000..cad50c213557ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distil_bert_finetuned_ner_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distil_bert_finetuned_ner DistilBertForTokenClassification from timmyAlvice +author: John Snow Labs +name: distil_bert_finetuned_ner +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distil_bert_finetuned_ner` is a English model originally trained by timmyAlvice. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distil_bert_finetuned_ner_en_5.5.1_3.0_1734310830602.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distil_bert_finetuned_ner_en_5.5.1_3.0_1734310830602.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distil_bert_finetuned_ner","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification.pretrained("distil_bert_finetuned_ner", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distil_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/timmyAlvice/distil-bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distil_bert_finetuned_ner_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-distil_bert_finetuned_ner_pipeline_en.md new file mode 100644 index 00000000000000..fbb1e1cfabb2bf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distil_bert_finetuned_ner_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distil_bert_finetuned_ner_pipeline pipeline DistilBertForTokenClassification from timmyAlvice +author: John Snow Labs +name: distil_bert_finetuned_ner_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distil_bert_finetuned_ner_pipeline` is a English model originally trained by timmyAlvice. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distil_bert_finetuned_ner_pipeline_en_5.5.1_3.0_1734310844103.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distil_bert_finetuned_ner_pipeline_en_5.5.1_3.0_1734310844103.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distil_bert_finetuned_ner_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distil_bert_finetuned_ner_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
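+
+The snippet above assumes `df` already exists. A minimal sketch of preparing input text and reading back the token-level tags (the sample sentence is illustrative, and the tags are assumed to land in the `ner` column):
+
+```python
+import sparknlp
+from sparknlp.pretrained import PretrainedPipeline
+
+spark = sparknlp.start()  # assumes Spark NLP is installed
+
+# illustrative input with a named entity in it
+df = spark.createDataFrame([["John Snow Labs is based in Delaware"]]).toDF("text")
+
+pipeline = PretrainedPipeline("distil_bert_finetuned_ner_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+annotations.select("token.result", "ner.result").show(truncate = False)
+```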
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distil_bert_finetuned_ner_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/timmyAlvice/distil-bert-finetuned-ner + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distil_train_token_classification_en.md b/docs/_posts/ahmedlone127/2024-12-16-distil_train_token_classification_en.md new file mode 100644 index 00000000000000..4deda0027fb250 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distil_train_token_classification_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distil_train_token_classification DistilBertForTokenClassification from ymgong +author: John Snow Labs +name: distil_train_token_classification +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distil_train_token_classification` is a English model originally trained by ymgong. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distil_train_token_classification_en_5.5.1_3.0_1734310952526.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distil_train_token_classification_en_5.5.1_3.0_1734310952526.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distil_train_token_classification","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification.pretrained("distil_train_token_classification", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distil_train_token_classification| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/ymgong/distil_train_token_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distil_train_token_classification_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-distil_train_token_classification_pipeline_en.md new file mode 100644 index 00000000000000..15d99fbd1f41d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distil_train_token_classification_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distil_train_token_classification_pipeline pipeline DistilBertForTokenClassification from ymgong +author: John Snow Labs +name: distil_train_token_classification_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distil_train_token_classification_pipeline` is a English model originally trained by ymgong. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distil_train_token_classification_pipeline_en_5.5.1_3.0_1734310965433.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distil_train_token_classification_pipeline_en_5.5.1_3.0_1734310965433.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distil_train_token_classification_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distil_train_token_classification_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
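+
+As above, `df` is assumed to exist. A hedged end-to-end sketch (illustrative input; `ner` is the assumed output column):
+
+```python
+import sparknlp
+from sparknlp.pretrained import PretrainedPipeline
+
+spark = sparknlp.start()  # assumes Spark NLP is installed
+
+df = spark.createDataFrame([["John Snow Labs is based in Delaware"]]).toDF("text")  # illustrative input
+
+pipeline = PretrainedPipeline("distil_train_token_classification_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+annotations.select("token.result", "ner.result").show(truncate = False)
+```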
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distil_train_token_classification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/ymgong/distil_train_token_classification + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_cased_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_cased_finetuned_ner_en.md new file mode 100644 index 00000000000000..fda5c2338ea0b8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_cased_finetuned_ner_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_ner DistilBertForTokenClassification from Anson212 +author: John Snow Labs +name: distilbert_base_cased_finetuned_ner +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_cased_finetuned_ner` is a English model originally trained by Anson212. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_en_5.5.1_3.0_1734310454524.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_en_5.5.1_3.0_1734310454524.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_ner","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_ner", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_ner| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/Anson212/distilbert-base-cased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_cased_finetuned_ner_eng_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_cased_finetuned_ner_eng_en.md new file mode 100644 index 00000000000000..478d131c478168 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_cased_finetuned_ner_eng_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_ner_eng DistilBertForTokenClassification from LukeGPT88 +author: John Snow Labs +name: distilbert_base_cased_finetuned_ner_eng +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_cased_finetuned_ner_eng` is a English model originally trained by LukeGPT88. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_eng_en_5.5.1_3.0_1734310035317.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_eng_en_5.5.1_3.0_1734310035317.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_ner_eng","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_ner_eng", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_ner_eng| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/LukeGPT88/distilbert-base-cased-finetuned-ner-eng \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_cased_finetuned_ner_eng_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_cased_finetuned_ner_eng_pipeline_en.md new file mode 100644 index 00000000000000..0db644fdb5bba4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_cased_finetuned_ner_eng_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_ner_eng_pipeline pipeline DistilBertForTokenClassification from LukeGPT88 +author: John Snow Labs +name: distilbert_base_cased_finetuned_ner_eng_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_cased_finetuned_ner_eng_pipeline` is a English model originally trained by LukeGPT88. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_eng_pipeline_en_5.5.1_3.0_1734310047820.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_eng_pipeline_en_5.5.1_3.0_1734310047820.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_cased_finetuned_ner_eng_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_cased_finetuned_ner_eng_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
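+
+The usage above assumes an existing `df` with a `text` column. A minimal sketch of running the pipeline and inspecting the tags (illustrative input; `ner` is the assumed output column):
+
+```python
+import sparknlp
+from sparknlp.pretrained import PretrainedPipeline
+
+spark = sparknlp.start()  # assumes Spark NLP is installed
+
+df = spark.createDataFrame([["John Snow Labs is based in Delaware"]]).toDF("text")  # illustrative input
+
+pipeline = PretrainedPipeline("distilbert_base_cased_finetuned_ner_eng_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+annotations.select("token.result", "ner.result").show(truncate = False)
+```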
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_ner_eng_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|243.9 MB| + +## References + +https://huggingface.co/LukeGPT88/distilbert-base-cased-finetuned-ner-eng + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_cased_finetuned_ner_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_cased_finetuned_ner_pipeline_en.md new file mode 100644 index 00000000000000..65c423383a1920 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_cased_finetuned_ner_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_ner_pipeline pipeline DistilBertForTokenClassification from Anson212 +author: John Snow Labs +name: distilbert_base_cased_finetuned_ner_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_cased_finetuned_ner_pipeline` is a English model originally trained by Anson212. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_pipeline_en_5.5.1_3.0_1734310467013.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_pipeline_en_5.5.1_3.0_1734310467013.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_cased_finetuned_ner_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_cased_finetuned_ner_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
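+
+As in the other NER pipeline cards, `df` is assumed to exist. A hedged sketch (illustrative input; `ner` is the assumed output column):
+
+```python
+import sparknlp
+from sparknlp.pretrained import PretrainedPipeline
+
+spark = sparknlp.start()  # assumes Spark NLP is installed
+
+df = spark.createDataFrame([["John Snow Labs is based in Delaware"]]).toDF("text")  # illustrative input
+
+pipeline = PretrainedPipeline("distilbert_base_cased_finetuned_ner_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+annotations.select("token.result", "ner.result").show(truncate = False)
+```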
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_ner_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|243.9 MB| + +## References + +https://huggingface.co/Anson212/distilbert-base-cased-finetuned-ner + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_eng_cased_ner_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_eng_cased_ner_en.md new file mode 100644 index 00000000000000..087254317578a6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_eng_cased_ner_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_eng_cased_ner DistilBertForTokenClassification from LukeGPT88 +author: John Snow Labs +name: distilbert_base_eng_cased_ner +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_eng_cased_ner` is a English model originally trained by LukeGPT88. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_eng_cased_ner_en_5.5.1_3.0_1734310521988.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_eng_cased_ner_en_5.5.1_3.0_1734310521988.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_eng_cased_ner","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_eng_cased_ner", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_eng_cased_ner| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/LukeGPT88/distilbert-base-eng-cased-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_eng_cased_ner_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_eng_cased_ner_pipeline_en.md new file mode 100644 index 00000000000000..a6653784210e4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_eng_cased_ner_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_eng_cased_ner_pipeline pipeline DistilBertForTokenClassification from LukeGPT88 +author: John Snow Labs +name: distilbert_base_eng_cased_ner_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_eng_cased_ner_pipeline` is a English model originally trained by LukeGPT88. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_eng_cased_ner_pipeline_en_5.5.1_3.0_1734310534239.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_eng_cased_ner_pipeline_en_5.5.1_3.0_1734310534239.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_eng_cased_ner_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_eng_cased_ner_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_eng_cased_ner_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/LukeGPT88/distilbert-base-eng-cased-ner + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_multilingual_cased_finetuned_geordie_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_multilingual_cased_finetuned_geordie_pipeline_xx.md new file mode 100644 index 00000000000000..76fee15191b948 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_multilingual_cased_finetuned_geordie_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_finetuned_geordie_pipeline pipeline DistilBertForTokenClassification from nicolauduran45 +author: John Snow Labs +name: distilbert_base_multilingual_cased_finetuned_geordie_pipeline +date: 2024-12-16 +tags: [xx, open_source, pipeline, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_multilingual_cased_finetuned_geordie_pipeline` is a Multilingual model originally trained by nicolauduran45. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_geordie_pipeline_xx_5.5.1_3.0_1734310656926.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_geordie_pipeline_xx_5.5.1_3.0_1734310656926.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_multilingual_cased_finetuned_geordie_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_multilingual_cased_finetuned_geordie_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_finetuned_geordie_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/nicolauduran45/distilbert-base-multilingual-cased-finetuned-geordie + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_multilingual_cased_finetuned_geordie_xx.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_multilingual_cased_finetuned_geordie_xx.md new file mode 100644 index 00000000000000..b07d216e826fe2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_multilingual_cased_finetuned_geordie_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_finetuned_geordie DistilBertForTokenClassification from nicolauduran45 +author: John Snow Labs +name: distilbert_base_multilingual_cased_finetuned_geordie +date: 2024-12-16 +tags: [xx, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_multilingual_cased_finetuned_geordie` is a Multilingual model originally trained by nicolauduran45. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_geordie_xx_5.5.1_3.0_1734310625551.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_geordie_xx_5.5.1_3.0_1734310625551.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+spark = sparknlp.start()  # start (or reuse) a Spark session with Spark NLP
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["document"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_multilingual_cased_finetuned_geordie","xx") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+import spark.implicits._
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_multilingual_cased_finetuned_geordie", "xx")
+  .setInputCols(Array("document", "token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_finetuned_geordie| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/nicolauduran45/distilbert-base-multilingual-cased-finetuned-geordie \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_turkish_cased_allnli_turkish_pipeline_tr.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_turkish_cased_allnli_turkish_pipeline_tr.md new file mode 100644 index 00000000000000..052f324ff8065b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_turkish_cased_allnli_turkish_pipeline_tr.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Turkish distilbert_base_turkish_cased_allnli_turkish_pipeline pipeline DistilBertForZeroShotClassification from emrecan +author: John Snow Labs +name: distilbert_base_turkish_cased_allnli_turkish_pipeline +date: 2024-12-16 +tags: [tr, open_source, pipeline, onnx] +task: Zero-Shot Classification +language: tr +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForZeroShotClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_turkish_cased_allnli_turkish_pipeline` is a Turkish model originally trained by emrecan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_turkish_cased_allnli_turkish_pipeline_tr_5.5.1_3.0_1734309576726.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_turkish_cased_allnli_turkish_pipeline_tr_5.5.1_3.0_1734309576726.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_turkish_cased_allnli_turkish_pipeline", lang = "tr") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_turkish_cased_allnli_turkish_pipeline", lang = "tr") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_turkish_cased_allnli_turkish_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|tr| +|Size:|254.1 MB| + +## References + +https://huggingface.co/emrecan/distilbert-base-turkish-cased-allnli_tr + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForZeroShotClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_turkish_cased_allnli_turkish_tr.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_turkish_cased_allnli_turkish_tr.md new file mode 100644 index 00000000000000..ac97725fba942c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_turkish_cased_allnli_turkish_tr.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Turkish distilbert_base_turkish_cased_allnli_turkish DistilBertForZeroShotClassification from emrecan +author: John Snow Labs +name: distilbert_base_turkish_cased_allnli_turkish +date: 2024-12-16 +tags: [tr, open_source, onnx, zero_shot, distilbert] +task: Zero-Shot Classification +language: tr +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForZeroShotClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForZeroShotClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_turkish_cased_allnli_turkish` is a Turkish model originally trained by emrecan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_turkish_cased_allnli_turkish_tr_5.5.1_3.0_1734309560037.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_turkish_cased_allnli_turkish_tr_5.5.1_3.0_1734309560037.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForZeroShotClassification
+from pyspark.ml import Pipeline
+
+spark = sparknlp.start()  # start (or reuse) a Spark session with Spark NLP
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["document"]) \
+    .setOutputCol("token")
+
+zeroShotClassifier = DistilBertForZeroShotClassification.pretrained("distilbert_base_turkish_cased_allnli_turkish","tr") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, zeroShotClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+import spark.implicits._
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val zeroShotClassifier = DistilBertForZeroShotClassification.pretrained("distilbert_base_turkish_cased_allnli_turkish", "tr")
+  .setInputCols(Array("document", "token"))
+  .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, zeroShotClassifier))
+val data = Seq("I love spark-nlp").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
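+
+A zero-shot classifier is only as useful as the label set it scores against. Assuming this annotator exposes `setCandidateLabels`, as other Spark NLP zero-shot classifiers do, the candidate classes could be supplied roughly like this (the Turkish labels here are purely illustrative):
+
+```python
+# Hypothetical label set; replace with the classes relevant to your task.
+zeroShotClassifier = DistilBertForZeroShotClassification.pretrained("distilbert_base_turkish_cased_allnli_turkish","tr") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("class") \
+    .setCandidateLabels(["spor", "ekonomi", "siyaset"])
+```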
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_turkish_cased_allnli_turkish| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|254.1 MB| + +## References + +https://huggingface.co/emrecan/distilbert-base-turkish-cased-allnli_tr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_turkish_cased_multinli_turkish_pipeline_tr.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_turkish_cased_multinli_turkish_pipeline_tr.md new file mode 100644 index 00000000000000..78fa4653a11775 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_turkish_cased_multinli_turkish_pipeline_tr.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Turkish distilbert_base_turkish_cased_multinli_turkish_pipeline pipeline DistilBertForZeroShotClassification from emrecan +author: John Snow Labs +name: distilbert_base_turkish_cased_multinli_turkish_pipeline +date: 2024-12-16 +tags: [tr, open_source, pipeline, onnx] +task: Zero-Shot Classification +language: tr +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForZeroShotClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_turkish_cased_multinli_turkish_pipeline` is a Turkish model originally trained by emrecan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_turkish_cased_multinli_turkish_pipeline_tr_5.5.1_3.0_1734309574962.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_turkish_cased_multinli_turkish_pipeline_tr_5.5.1_3.0_1734309574962.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_turkish_cased_multinli_turkish_pipeline", lang = "tr") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_turkish_cased_multinli_turkish_pipeline", lang = "tr") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_turkish_cased_multinli_turkish_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|tr| +|Size:|254.1 MB| + +## References + +https://huggingface.co/emrecan/distilbert-base-turkish-cased-multinli_tr + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForZeroShotClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_turkish_cased_multinli_turkish_tr.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_turkish_cased_multinli_turkish_tr.md new file mode 100644 index 00000000000000..f53f1d13cb3c5a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_turkish_cased_multinli_turkish_tr.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Turkish distilbert_base_turkish_cased_multinli_turkish DistilBertForZeroShotClassification from emrecan +author: John Snow Labs +name: distilbert_base_turkish_cased_multinli_turkish +date: 2024-12-16 +tags: [tr, open_source, onnx, zero_shot, distilbert] +task: Zero-Shot Classification +language: tr +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForZeroShotClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForZeroShotClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_turkish_cased_multinli_turkish` is a Turkish model originally trained by emrecan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_turkish_cased_multinli_turkish_tr_5.5.1_3.0_1734309558622.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_turkish_cased_multinli_turkish_tr_5.5.1_3.0_1734309558622.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForZeroShotClassification
+from pyspark.ml import Pipeline
+
+spark = sparknlp.start()  # start (or reuse) a Spark session with Spark NLP
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["document"]) \
+    .setOutputCol("token")
+
+zeroShotClassifier = DistilBertForZeroShotClassification.pretrained("distilbert_base_turkish_cased_multinli_turkish","tr") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, zeroShotClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+import spark.implicits._
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val zeroShotClassifier = DistilBertForZeroShotClassification.pretrained("distilbert_base_turkish_cased_multinli_turkish", "tr")
+  .setInputCols(Array("document", "token"))
+  .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, zeroShotClassifier))
+val data = Seq("I love spark-nlp").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_turkish_cased_multinli_turkish| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|254.1 MB| + +## References + +https://huggingface.co/emrecan/distilbert-base-turkish-cased-multinli_tr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_en.md new file mode 100644 index 00000000000000..0e5bfd6b127259 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: DistilBERT base model (uncased) +author: John Snow Labs +name: distilbert_base_uncased +date: 2024-12-16 +tags: [distilbert, en, english, embeddings, open_source, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +This model is a distilled version of the [BERT base model](https://huggingface.co/bert-base-cased). It was introduced in [this paper](https://arxiv.org/abs/1910.01108). The code for the distillation process can be found [here](https://github.com/huggingface/transformers/tree/master/examples/research_projects/distillation). This model is uncased: it does not make a difference between english and English. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_en_5.5.1_3.0_1734311120051.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_en_5.5.1_3.0_1734311120051.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased", "en") \ +.setInputCols("sentence", "token") \ +.setOutputCol("embeddings") +nlp_pipeline = Pipeline(stages=[document_assembler, sentence_detector, tokenizer, embeddings]) +``` +```scala +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased", "en") +.setInputCols("sentence", "token") +.setOutputCol("embeddings") +val pipeline = new Pipeline().setStages(Array(document_assembler, sentence_detector, tokenizer, embeddings)) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.embed.distilbert.base.uncased").predict("""Put your text here.""") +``` +
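+
+The snippet above assumes `document_assembler`, `sentence_detector`, and `tokenizer` are already defined. A minimal self-contained sketch of that setup (column names are illustrative and can be renamed):
+
+```python
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import SentenceDetector, Tokenizer, DistilBertEmbeddings
+from pyspark.ml import Pipeline
+
+spark = sparknlp.start()
+
+# Assemble raw text, split it into sentences and tokens, then embed each token.
+document_assembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+sentence_detector = SentenceDetector() \
+    .setInputCols(["document"]) \
+    .setOutputCol("sentence")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["sentence"]) \
+    .setOutputCol("token")
+
+embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased", "en") \
+    .setInputCols(["sentence", "token"]) \
+    .setOutputCol("embeddings")
+
+nlp_pipeline = Pipeline(stages=[document_assembler, sentence_detector, tokenizer, embeddings])
+data = spark.createDataFrame([["Put your text here."]]).toDF("text")
+result = nlp_pipeline.fit(data).transform(data)
+```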
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +References + +[https://huggingface.co/distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) + +## Benchmarking + +```bash + +Benchmarking + + +When fine-tuned on downstream tasks, this model achieves the following results: + +Glue test results: + +| Task | MNLI | QQP | QNLI | SST-2 | CoLA | STS-B | MRPC | RTE | +|:----:|:----:|:----:|:----:|:-----:|:----:|:-----:|:----:|:----:| +| | 82.2 | 88.5 | 89.2 | 91.3 | 51.3 | 85.8 | 87.5 | 59.9 | +``` \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_imdb_rezakakooee_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_imdb_rezakakooee_en.md new file mode 100644 index 00000000000000..041d83a78d6111 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_imdb_rezakakooee_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_rezakakooee DistilBertEmbeddings from Rezakakooee +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_rezakakooee +date: 2024-12-16 +tags: [en, open_source, onnx, embeddings, distilbert] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_rezakakooee` is a English model originally trained by Rezakakooee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_rezakakooee_en_5.5.1_3.0_1734307541397.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_rezakakooee_en_5.5.1_3.0_1734307541397.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_rezakakooee","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_imdb_rezakakooee","en") + .setInputCols(Array("document", "token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
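+
+To pull the raw vectors out of the `embeddings` output column, each annotation's `embeddings` field can be exploded into one row per token; a short, illustrative example:
+
+```python
+# One row per token; "vector" holds that token's DistilBERT embedding.
+pipelineDF.selectExpr("explode(embeddings.embeddings) as vector").show(1, truncate=80)
+```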
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_rezakakooee| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[distilbert]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Rezakakooee/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_imdb_rezakakooee_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_imdb_rezakakooee_pipeline_en.md new file mode 100644 index 00000000000000..69d6674a766fea --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_imdb_rezakakooee_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_rezakakooee_pipeline pipeline DistilBertEmbeddings from Rezakakooee +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_rezakakooee_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_imdb_rezakakooee_pipeline` is a English model originally trained by Rezakakooee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_rezakakooee_pipeline_en_5.5.1_3.0_1734307554681.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_rezakakooee_pipeline_en_5.5.1_3.0_1734307554681.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_rezakakooee_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_imdb_rezakakooee_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_rezakakooee_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Rezakakooee/distilbert-base-uncased-finetuned-imdb + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_alicedh_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_alicedh_en.md new file mode 100644 index 00000000000000..9aa785c9d0b424 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_alicedh_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_alicedh DistilBertForTokenClassification from Alicedh +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_alicedh +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_ner_alicedh` is a English model originally trained by Alicedh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_alicedh_en_5.5.1_3.0_1734310513681.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_alicedh_en_5.5.1_3.0_1734310513681.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+spark = sparknlp.start()  # start (or reuse) a Spark session with Spark NLP
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["document"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_alicedh","en") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+import spark.implicits._
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_alicedh", "en")
+  .setInputCols(Array("document", "token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
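+
+The fitted `PipelineModel` can also be persisted so the pretrained weights are not downloaded again on the next run; a sketch using standard Spark ML persistence (the path is illustrative):
+
+```python
+from pyspark.ml import PipelineModel
+
+# Save the fitted pipeline to disk and load it back later for reuse.
+pipelineModel.write().overwrite().save("/tmp/distilbert_ner_alicedh_pipeline")
+restored = PipelineModel.load("/tmp/distilbert_ner_alicedh_pipeline")
+restoredDF = restored.transform(data)
+```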
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_alicedh| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Alicedh/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_alicedh_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_alicedh_pipeline_en.md new file mode 100644 index 00000000000000..0f5b03fc52810b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_alicedh_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_alicedh_pipeline pipeline DistilBertForTokenClassification from Alicedh +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_alicedh_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_ner_alicedh_pipeline` is a English model originally trained by Alicedh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_alicedh_pipeline_en_5.5.1_3.0_1734310526294.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_alicedh_pipeline_en_5.5.1_3.0_1734310526294.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_ner_alicedh_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_ner_alicedh_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_alicedh_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Alicedh/distilbert-base-uncased-finetuned-ner + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_anson212_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_anson212_en.md new file mode 100644 index 00000000000000..248be6cf1a5bfe --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_anson212_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_anson212 DistilBertForTokenClassification from Anson212 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_anson212 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_ner_anson212` is a English model originally trained by Anson212. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_anson212_en_5.5.1_3.0_1734311017888.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_anson212_en_5.5.1_3.0_1734311017888.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+spark = sparknlp.start()  # start (or reuse) a Spark session with Spark NLP
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["document"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_anson212","en") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+import spark.implicits._
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_anson212", "en")
+  .setInputCols(Array("document", "token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_anson212| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Anson212/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_anson212_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_anson212_pipeline_en.md new file mode 100644 index 00000000000000..d68b9b4836492e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_anson212_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_anson212_pipeline pipeline DistilBertForTokenClassification from Anson212 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_anson212_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_ner_anson212_pipeline` is a English model originally trained by Anson212. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_anson212_pipeline_en_5.5.1_3.0_1734311030519.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_anson212_pipeline_en_5.5.1_3.0_1734311030519.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_ner_anson212_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_ner_anson212_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_anson212_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Anson212/distilbert-base-uncased-finetuned-ner + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_cadec_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_cadec_en.md new file mode 100644 index 00000000000000..f67bc75e45b6e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_cadec_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_cadec DistilBertForTokenClassification from csNoHug +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_cadec +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_ner_cadec` is a English model originally trained by csNoHug. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_cadec_en_5.5.1_3.0_1734310482712.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_cadec_en_5.5.1_3.0_1734310482712.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+spark = sparknlp.start()  # start (or reuse) a Spark session with Spark NLP
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["document"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_cadec","en") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+import spark.implicits._
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_cadec", "en")
+  .setInputCols(Array("document", "token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_cadec| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/csNoHug/distilbert-base-uncased-finetuned-ner-cadec \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_cadec_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_cadec_pipeline_en.md new file mode 100644 index 00000000000000..b164ad21e0cf67 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_cadec_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_cadec_pipeline pipeline DistilBertForTokenClassification from csNoHug +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_cadec_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_ner_cadec_pipeline` is a English model originally trained by csNoHug. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_cadec_pipeline_en_5.5.1_3.0_1734310495950.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_cadec_pipeline_en_5.5.1_3.0_1734310495950.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_ner_cadec_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_ner_cadec_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_cadec_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/csNoHug/distilbert-base-uncased-finetuned-ner-cadec + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_edoumazane_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_edoumazane_en.md new file mode 100644 index 00000000000000..82f35be29672ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_edoumazane_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_edoumazane DistilBertForTokenClassification from edoumazane +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_edoumazane +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_ner_edoumazane` is a English model originally trained by edoumazane. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_edoumazane_en_5.5.1_3.0_1734310305731.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_edoumazane_en_5.5.1_3.0_1734310305731.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+spark = sparknlp.start()  # start (or reuse) a Spark session with Spark NLP
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["document"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_edoumazane","en") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+import spark.implicits._
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_edoumazane", "en")
+  .setInputCols(Array("document", "token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_edoumazane| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/edoumazane/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_edoumazane_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_edoumazane_pipeline_en.md new file mode 100644 index 00000000000000..924228d810b818 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_edoumazane_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_edoumazane_pipeline pipeline DistilBertForTokenClassification from edoumazane +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_edoumazane_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_ner_edoumazane_pipeline` is a English model originally trained by edoumazane. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_edoumazane_pipeline_en_5.5.1_3.0_1734310318824.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_edoumazane_pipeline_en_5.5.1_3.0_1734310318824.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_ner_edoumazane_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_ner_edoumazane_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_edoumazane_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/edoumazane/distilbert-base-uncased-finetuned-ner + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_pnr_svc_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_pnr_svc_en.md new file mode 100644 index 00000000000000..b836642c8a0172 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_pnr_svc_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_pnr_svc DistilBertForTokenClassification from pnr-svc +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_pnr_svc +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_ner_pnr_svc` is a English model originally trained by pnr-svc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_pnr_svc_en_5.5.1_3.0_1734310751824.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_pnr_svc_en_5.5.1_3.0_1734310751824.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+spark = sparknlp.start()  # start (or reuse) a Spark session with Spark NLP
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["document"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_pnr_svc","en") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+import spark.implicits._
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_pnr_svc", "en")
+  .setInputCols(Array("document", "token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_pnr_svc| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|251.9 MB| + +## References + +https://huggingface.co/pnr-svc/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_pnr_svc_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_pnr_svc_pipeline_en.md new file mode 100644 index 00000000000000..581f7c04b1fdf4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_pnr_svc_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_pnr_svc_pipeline pipeline DistilBertForTokenClassification from pnr-svc +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_pnr_svc_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_ner_pnr_svc_pipeline` is a English model originally trained by pnr-svc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_pnr_svc_pipeline_en_5.5.1_3.0_1734310764881.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_pnr_svc_pipeline_en_5.5.1_3.0_1734310764881.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_ner_pnr_svc_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_ner_pnr_svc_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
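
The `df` referenced above is simply a Spark DataFrame with a `text` column; it is not defined in the snippet. A minimal sketch of creating it, plus the lightweight single-string mode, is shown below (the example sentence is illustrative only, and the `"ner"` key assumes the output column listed in the model information that follows).

```python
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_ner_pnr_svc_pipeline", lang="en")

# Batch mode: any DataFrame with a "text" column works.
df = spark.createDataFrame([["John works at John Snow Labs in London."]]).toDF("text")
annotations = pipeline.transform(df)

# Lightweight single-string mode for quick checks.
print(pipeline.annotate("John works at John Snow Labs in London.")["ner"])
```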
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_pnr_svc_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|251.9 MB| + +## References + +https://huggingface.co/pnr-svc/distilbert-base-uncased-finetuned-ner + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_prathamnagpure_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_prathamnagpure_en.md new file mode 100644 index 00000000000000..ef5b2b675374ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_prathamnagpure_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_prathamnagpure DistilBertForTokenClassification from prathamnagpure +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_prathamnagpure +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_ner_prathamnagpure` is a English model originally trained by prathamnagpure. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_prathamnagpure_en_5.5.1_3.0_1734310225552.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_prathamnagpure_en_5.5.1_3.0_1734310225552.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_prathamnagpure","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCols("text") + .setOutputCols("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_prathamnagpure", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_prathamnagpure| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/prathamnagpure/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_prathamnagpure_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_prathamnagpure_pipeline_en.md new file mode 100644 index 00000000000000..2fee45cfdb8cc4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_finetuned_ner_prathamnagpure_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_prathamnagpure_pipeline pipeline DistilBertForTokenClassification from prathamnagpure +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_prathamnagpure_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_finetuned_ner_prathamnagpure_pipeline` is a English model originally trained by prathamnagpure. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_prathamnagpure_pipeline_en_5.5.1_3.0_1734310238060.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_prathamnagpure_pipeline_en_5.5.1_3.0_1734310238060.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_finetuned_ner_prathamnagpure_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_finetuned_ner_prathamnagpure_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_prathamnagpure_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/prathamnagpure/distilbert-base-uncased-finetuned-ner + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_pipeline_en.md new file mode 100644 index 00000000000000..eafaeee33a74ee --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_uncased_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_uncased_pipeline pipeline DistilBertForTokenClassification from bloomdata +author: John Snow Labs +name: distilbert_base_uncased_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_uncased_pipeline` is a English model originally trained by bloomdata. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_pipeline_en_5.5.1_3.0_1734311132403.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_pipeline_en_5.5.1_3.0_1734311132403.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_uncased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_uncased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/bloomdata/distilbert-base-uncased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_zero_shot_classifier_uncased_mnli_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_zero_shot_classifier_uncased_mnli_en.md new file mode 100644 index 00000000000000..500aa4765260bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_zero_shot_classifier_uncased_mnli_en.md @@ -0,0 +1,110 @@ +--- +layout: model +title: DistilBERTZero-Shot Classification Base - MNLI(distilbert_base_zero_shot_classifier_uncased_mnli) +author: John Snow Labs +name: distilbert_base_zero_shot_classifier_uncased_mnli +date: 2024-12-16 +tags: [zero_shot, en, mnli, distilbert, english, base, open_source, openvino, onnx] +task: Zero-Shot Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForZeroShotClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +This model is intended to be used for zero-shot text classification, especially in English. It is fine-tuned on MNLI by using DistilBERT Base Uncased model. + +DistilBertForZeroShotClassification using a ModelForSequenceClassification trained on NLI (natural language inference) tasks. Equivalent of DistilBertForSequenceClassification models, but these models don’t require a hardcoded number of potential classes, they can be chosen at runtime. It usually means it’s slower but it is much more flexible. + +We used TFDistilBertForSequenceClassification to train this model and used DistilBertForZeroShotClassification annotator in Spark NLP 🚀 for prediction at scale! + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_uncased_mnli_en_5.5.1_3.0_1734309637080.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_uncased_mnli_en_5.5.1_3.0_1734309637080.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
document_assembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

zeroShotClassifier = DistilBertForZeroShotClassification \
    .pretrained('distilbert_base_zero_shot_classifier_uncased_mnli', 'en') \
    .setInputCols(['token', 'document']) \
    .setOutputCol('class') \
    .setCaseSensitive(True) \
    .setMaxSentenceLength(512) \
    .setCandidateLabels(["urgent", "mobile", "travel", "movie", "music", "sport", "weather", "technology"])

pipeline = Pipeline(stages=[
    document_assembler,
    tokenizer,
    zeroShotClassifier
])

example = spark.createDataFrame([['I have a problem with my iphone that needs to be resolved asap!!']]).toDF("text")
result = pipeline.fit(example).transform(example)
```
```scala
val document_assembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val zeroShotClassifier = DistilBertForZeroShotClassification.pretrained("distilbert_base_zero_shot_classifier_uncased_mnli", "en")
    .setInputCols("document", "token")
    .setOutputCol("class")
    .setCaseSensitive(true)
    .setMaxSentenceLength(512)
    .setCandidateLabels(Array("urgent", "mobile", "travel", "movie", "music", "sport", "weather", "technology"))

val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, zeroShotClassifier))
val example = Seq("I have a problem with my iphone that needs to be resolved asap!!").toDS.toDF("text")
val result = pipeline.fit(example).transform(example)
```
</div>
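
Once `result` has been computed, the winning candidate label for each row sits in the `class` output column. The sketch below shows one way to read it back; the metadata layout follows the usual Spark NLP classifier convention and is included here as an assumption rather than a documented guarantee.

```python
# The predicted label(s) per input row.
result.selectExpr("text", "class.result as predicted_labels").show(truncate=False)

# The annotation metadata typically carries per-label scores for inspection.
result.selectExpr("explode(class) as c") \
    .selectExpr("c.result", "c.metadata") \
    .show(truncate=False)
```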
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_zero_shot_classifier_uncased_mnli| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +References + +https://huggingface.co/typeform/distilbert-base-uncased-mnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_zero_shot_classifier_uncased_mnli_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_zero_shot_classifier_uncased_mnli_pipeline_en.md new file mode 100644 index 00000000000000..3020bfa47e4318 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_base_zero_shot_classifier_uncased_mnli_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_base_zero_shot_classifier_uncased_mnli_pipeline pipeline DistilBertForZeroShotClassification from typeform +author: John Snow Labs +name: distilbert_base_zero_shot_classifier_uncased_mnli_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Zero-Shot Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForZeroShotClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_zero_shot_classifier_uncased_mnli_pipeline` is a English model originally trained by typeform. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_uncased_mnli_pipeline_en_5.5.1_3.0_1734309650064.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_uncased_mnli_pipeline_en_5.5.1_3.0_1734309650064.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_base_zero_shot_classifier_uncased_mnli_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_base_zero_shot_classifier_uncased_mnli_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
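
A zero-shot classifier still needs candidate labels at runtime, and the downloaded pipeline does not ship with a fixed label set. One way to supply them is sketched below; it assumes the zero-shot stage is the last stage of the loaded `PipelineModel`, which should be verified with `pipeline.model.stages` before relying on it.

```python
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("distilbert_base_zero_shot_classifier_uncased_mnli_pipeline", lang="en")

# The loaded PipelineModel is exposed as pipeline.model; the classifier is assumed to be its last stage.
zero_shot_stage = pipeline.model.stages[-1]
zero_shot_stage.setCandidateLabels(["urgent", "travel", "technology", "sport"])

df = spark.createDataFrame([["My flight to Berlin was cancelled and I need help now!"]]).toDF("text")
annotations = pipeline.transform(df)
annotations.selectExpr("class.result").show(truncate=False)
```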
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_zero_shot_classifier_uncased_mnli_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/typeform/distilbert-base-uncased-mnli + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForZeroShotClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_binary_ner_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_binary_ner_en.md new file mode 100644 index 00000000000000..d6fc09554566a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_binary_ner_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_binary_ner DistilBertForTokenClassification from Mahesh098 +author: John Snow Labs +name: distilbert_binary_ner +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_binary_ner` is a English model originally trained by Mahesh098. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_binary_ner_en_5.5.1_3.0_1734310217474.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_binary_ner_en_5.5.1_3.0_1734310217474.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_binary_ner","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_binary_ner", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_binary_ner| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/Mahesh098/distilbert-binary-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_binary_ner_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_binary_ner_pipeline_en.md new file mode 100644 index 00000000000000..4eff839ed7aa36 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_binary_ner_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_binary_ner_pipeline pipeline DistilBertForTokenClassification from Mahesh098 +author: John Snow Labs +name: distilbert_binary_ner_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_binary_ner_pipeline` is a English model originally trained by Mahesh098. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_binary_ner_pipeline_en_5.5.1_3.0_1734310229922.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_binary_ner_pipeline_en_5.5.1_3.0_1734310229922.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_binary_ner_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_binary_ner_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_binary_ner_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/Mahesh098/distilbert-binary-ner + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_clinical_ner_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_clinical_ner_en.md new file mode 100644 index 00000000000000..598dda0d039489 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_clinical_ner_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_clinical_ner DistilBertForTokenClassification from ribhu +author: John Snow Labs +name: distilbert_clinical_ner +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_clinical_ner` is a English model originally trained by ribhu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_clinical_ner_en_5.5.1_3.0_1734310184477.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_clinical_ner_en_5.5.1_3.0_1734310184477.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_clinical_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCols("text") + .setOutputCols("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_clinical_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
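
The upstream card does not list which clinical entity labels this model was trained on, so it can be useful to inspect them directly from the loaded annotator. A small sketch:

```python
# Load the annotator once and print the label set it was exported with.
tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_clinical_ner", "en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("ner")

print(tokenClassifier.getClasses())
```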
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_clinical_ner| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/ribhu/distilbert-clinical-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_clinical_ner_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_clinical_ner_pipeline_en.md new file mode 100644 index 00000000000000..c4be5fc52c27e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_clinical_ner_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_clinical_ner_pipeline pipeline DistilBertForTokenClassification from ribhu +author: John Snow Labs +name: distilbert_clinical_ner_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_clinical_ner_pipeline` is a English model originally trained by ribhu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_clinical_ner_pipeline_en_5.5.1_3.0_1734310197481.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_clinical_ner_pipeline_en_5.5.1_3.0_1734310197481.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_clinical_ner_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_clinical_ner_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_clinical_ner_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/ribhu/distilbert-clinical-ner + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_ner_finetuned_on_mountines_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_ner_finetuned_on_mountines_en.md new file mode 100644 index 00000000000000..42ef3bdb28ddbe --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_ner_finetuned_on_mountines_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_ner_finetuned_on_mountines DistilBertForTokenClassification from dimanoid12331 +author: John Snow Labs +name: distilbert_ner_finetuned_on_mountines +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_ner_finetuned_on_mountines` is a English model originally trained by dimanoid12331. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_ner_finetuned_on_mountines_en_5.5.1_3.0_1734311073784.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_ner_finetuned_on_mountines_en_5.5.1_3.0_1734311073784.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_ner_finetuned_on_mountines","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCols("text") + .setOutputCols("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_ner_finetuned_on_mountines", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_ner_finetuned_on_mountines| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/dimanoid12331/distilbert-NER_finetuned_on_mountines \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_ner_finetuned_on_mountines_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_ner_finetuned_on_mountines_pipeline_en.md new file mode 100644 index 00000000000000..8acb6bc82dfab7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_ner_finetuned_on_mountines_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_ner_finetuned_on_mountines_pipeline pipeline DistilBertForTokenClassification from dimanoid12331 +author: John Snow Labs +name: distilbert_ner_finetuned_on_mountines_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_ner_finetuned_on_mountines_pipeline` is a English model originally trained by dimanoid12331. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_ner_finetuned_on_mountines_pipeline_en_5.5.1_3.0_1734311086562.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_ner_finetuned_on_mountines_pipeline_en_5.5.1_3.0_1734311086562.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_ner_finetuned_on_mountines_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_ner_finetuned_on_mountines_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_ner_finetuned_on_mountines_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/dimanoid12331/distilbert-NER_finetuned_on_mountines + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_uncased_ner_lora_chidamnat2002_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_uncased_ner_lora_chidamnat2002_en.md new file mode 100644 index 00000000000000..96f49b33558ba2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_uncased_ner_lora_chidamnat2002_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_uncased_ner_lora_chidamnat2002 DistilBertForTokenClassification from chidamnat2002 +author: John Snow Labs +name: distilbert_uncased_ner_lora_chidamnat2002 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_uncased_ner_lora_chidamnat2002` is a English model originally trained by chidamnat2002. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_uncased_ner_lora_chidamnat2002_en_5.5.1_3.0_1734310628098.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_uncased_ner_lora_chidamnat2002_en_5.5.1_3.0_1734310628098.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_uncased_ner_lora_chidamnat2002","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCols("text") + .setOutputCols("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_uncased_ner_lora_chidamnat2002", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_uncased_ner_lora_chidamnat2002| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/chidamnat2002/distilbert-uncased-NER-LoRA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_uncased_ner_lora_chidamnat2002_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_uncased_ner_lora_chidamnat2002_pipeline_en.md new file mode 100644 index 00000000000000..0b921d40b5eec0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_uncased_ner_lora_chidamnat2002_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_uncased_ner_lora_chidamnat2002_pipeline pipeline DistilBertForTokenClassification from chidamnat2002 +author: John Snow Labs +name: distilbert_uncased_ner_lora_chidamnat2002_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_uncased_ner_lora_chidamnat2002_pipeline` is a English model originally trained by chidamnat2002. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_uncased_ner_lora_chidamnat2002_pipeline_en_5.5.1_3.0_1734310648141.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_uncased_ner_lora_chidamnat2002_pipeline_en_5.5.1_3.0_1734310648141.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_uncased_ner_lora_chidamnat2002_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_uncased_ner_lora_chidamnat2002_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_uncased_ner_lora_chidamnat2002_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/chidamnat2002/distilbert-uncased-NER-LoRA + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_uncased_ner_lora_mozilla_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_uncased_ner_lora_mozilla_en.md new file mode 100644 index 00000000000000..e1cc2facff59bb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_uncased_ner_lora_mozilla_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_uncased_ner_lora_mozilla DistilBertForTokenClassification from Mozilla +author: John Snow Labs +name: distilbert_uncased_ner_lora_mozilla +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_uncased_ner_lora_mozilla` is a English model originally trained by Mozilla. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_uncased_ner_lora_mozilla_en_5.5.1_3.0_1734310766440.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_uncased_ner_lora_mozilla_en_5.5.1_3.0_1734310766440.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_uncased_ner_lora_mozilla","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCols("text") + .setOutputCols("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_uncased_ner_lora_mozilla", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
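
For single sentences or small batches, wrapping the fitted pipeline in a `LightPipeline` avoids the overhead of a full DataFrame job. A minimal sketch, reusing the `pipelineModel` from the snippet above (the example sentence is illustrative only):

```python
from sparknlp.base import LightPipeline

light = LightPipeline(pipelineModel)

# Returns plain Python lists instead of a DataFrame; handy for quick checks.
print(light.annotate("Mozilla released a new build of Firefox in Mountain View.")["ner"])
```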
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_uncased_ner_lora_mozilla| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Mozilla/distilbert-uncased-NER-LoRA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_uncased_ner_lora_mozilla_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_uncased_ner_lora_mozilla_pipeline_en.md new file mode 100644 index 00000000000000..606d57dcf1624a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_uncased_ner_lora_mozilla_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_uncased_ner_lora_mozilla_pipeline pipeline DistilBertForTokenClassification from Mozilla +author: John Snow Labs +name: distilbert_uncased_ner_lora_mozilla_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_uncased_ner_lora_mozilla_pipeline` is a English model originally trained by Mozilla. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_uncased_ner_lora_mozilla_pipeline_en_5.5.1_3.0_1734310782093.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_uncased_ner_lora_mozilla_pipeline_en_5.5.1_3.0_1734310782093.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_uncased_ner_lora_mozilla_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_uncased_ner_lora_mozilla_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_uncased_ner_lora_mozilla_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Mozilla/distilbert-uncased-NER-LoRA + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_zeroshot_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_zeroshot_en.md new file mode 100644 index 00000000000000..4ffe32acfed762 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_zeroshot_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distilbert_zeroshot DistilBertForZeroShotClassification from AyoubChLin +author: John Snow Labs +name: distilbert_zeroshot +date: 2024-12-16 +tags: [en, open_source, onnx, zero_shot, distilbert] +task: Zero-Shot Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForZeroShotClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForZeroShotClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_zeroshot` is a English model originally trained by AyoubChLin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_zeroshot_en_5.5.1_3.0_1734309558548.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_zeroshot_en_5.5.1_3.0_1734309558548.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

zeroShotClassifier = DistilBertForZeroShotClassification.pretrained("distilbert_zeroshot","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class") \
    .setCandidateLabels(["positive", "negative"])  # candidate labels are chosen at runtime; these are placeholders

pipeline = Pipeline().setStages([documentAssembler, tokenizer, zeroShotClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val zeroShotClassifier = DistilBertForZeroShotClassification.pretrained("distilbert_zeroshot", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")
    .setCandidateLabels(Array("positive", "negative"))  // candidate labels are chosen at runtime; these are placeholders

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, zeroShotClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_zeroshot| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/AyoubChLin/DistilBERT_ZeroShot \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilbert_zeroshot_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilbert_zeroshot_pipeline_en.md new file mode 100644 index 00000000000000..91e2d1c61cf651 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilbert_zeroshot_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distilbert_zeroshot_pipeline pipeline DistilBertForZeroShotClassification from AyoubChLin +author: John Snow Labs +name: distilbert_zeroshot_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Zero-Shot Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForZeroShotClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_zeroshot_pipeline` is a English model originally trained by AyoubChLin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_zeroshot_pipeline_en_5.5.1_3.0_1734309576647.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_zeroshot_pipeline_en_5.5.1_3.0_1734309576647.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilbert_zeroshot_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilbert_zeroshot_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_zeroshot_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/AyoubChLin/DistilBERT_ZeroShot + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForZeroShotClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distillbert_clinical_ner_en.md b/docs/_posts/ahmedlone127/2024-12-16-distillbert_clinical_ner_en.md new file mode 100644 index 00000000000000..ecdf7e35f5c373 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distillbert_clinical_ner_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English distillbert_clinical_ner DistilBertForTokenClassification from ribhu +author: John Snow Labs +name: distillbert_clinical_ner +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distillbert_clinical_ner` is a English model originally trained by ribhu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distillbert_clinical_ner_en_5.5.1_3.0_1734310044826.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distillbert_clinical_ner_en_5.5.1_3.0_1734310044826.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = DistilBertForTokenClassification.pretrained("distillbert_clinical_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCols("text") + .setOutputCols("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distillbert_clinical_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distillbert_clinical_ner| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/ribhu/distillbert-clinical-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distillbert_clinical_ner_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-distillbert_clinical_ner_pipeline_en.md new file mode 100644 index 00000000000000..17c790534f381c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distillbert_clinical_ner_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English distillbert_clinical_ner_pipeline pipeline DistilBertForTokenClassification from ribhu +author: John Snow Labs +name: distillbert_clinical_ner_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distillbert_clinical_ner_pipeline` is a English model originally trained by ribhu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distillbert_clinical_ner_pipeline_en_5.5.1_3.0_1734310064437.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distillbert_clinical_ner_pipeline_en_5.5.1_3.0_1734310064437.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distillbert_clinical_ner_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distillbert_clinical_ner_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distillbert_clinical_ner_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/ribhu/distillbert-clinical-ner + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-distilvit_mozilla_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-distilvit_mozilla_pipeline_en.md new file mode 100644 index 00000000000000..460d1bcb71491e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-distilvit_mozilla_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English distilvit_mozilla_pipeline pipeline VisionEncoderDecoderForImageCaptioning from Mozilla +author: John Snow Labs +name: distilvit_mozilla_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Captioning +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained VisionEncoderDecoderForImageCaptioning, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilvit_mozilla_pipeline` is a English model originally trained by Mozilla. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilvit_mozilla_pipeline_en_5.5.1_3.0_1734318841176.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilvit_mozilla_pipeline_en_5.5.1_3.0_1734318841176.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("distilvit_mozilla_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("distilvit_mozilla_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
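
The captioning pipeline expects image rows rather than text. Below is a minimal sketch of feeding it a local folder of images; the folder path is a placeholder, and the output column name (shown here as `caption`) should be confirmed with `captions.columns`, since the upstream card does not state it.

```python
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("distilvit_mozilla_pipeline", lang="en")

# Spark's built-in image source produces the "image" struct column that the ImageAssembler stage reads.
image_df = spark.read.format("image") \
    .option("dropInvalid", True) \
    .load("path/to/images")  # placeholder path

captions = pipeline.transform(image_df)
captions.selectExpr("image.origin", "caption.result").show(truncate=False)  # column name assumed
```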
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilvit_mozilla_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|825.6 MB| + +## References + +https://huggingface.co/Mozilla/distilvit + +## Included Models + +- ImageAssembler +- VisionEncoderDecoderForImageCaptioning \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-ds_en.md b/docs/_posts/ahmedlone127/2024-12-16-ds_en.md new file mode 100644 index 00000000000000..618776b53a6277 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-ds_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English ds T5Transformer from zaid-farhan +author: John Snow Labs +name: ds +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ds` is a English model originally trained by zaid-farhan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ds_en_5.5.1_3.0_1734331745094.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ds_en_5.5.1_3.0_1734331745094.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("ds","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("ds", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
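
After the pipeline above has run, the generated text sits in the `output` column listed under Output Labels below; for example, in Python:

```python
pipelineDF.selectExpr("explode(output.result) as generated_text").show(truncate = False)
```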
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ds| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/zaid-farhan/ds \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-ds_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-ds_pipeline_en.md new file mode 100644 index 00000000000000..bff94ca19fbda0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-ds_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English ds_pipeline pipeline T5Transformer from zaid-farhan +author: John Snow Labs +name: ds_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ds_pipeline` is a English model originally trained by zaid-farhan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ds_pipeline_en_5.5.1_3.0_1734331798702.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ds_pipeline_en_5.5.1_3.0_1734331798702.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("ds_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("ds_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ds_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/zaid-farhan/ds + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-english_lithuanian_t5_small_en.md b/docs/_posts/ahmedlone127/2024-12-16-english_lithuanian_t5_small_en.md new file mode 100644 index 00000000000000..e6a332042f1e59 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-english_lithuanian_t5_small_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English english_lithuanian_t5_small T5Transformer from osmanh +author: John Snow Labs +name: english_lithuanian_t5_small +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`english_lithuanian_t5_small` is a English model originally trained by osmanh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/english_lithuanian_t5_small_en_5.5.1_3.0_1734327336738.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/english_lithuanian_t5_small_en_5.5.1_3.0_1734327336738.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("english_lithuanian_t5_small","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("english_lithuanian_t5_small", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|english_lithuanian_t5_small| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|343.1 MB| + +## References + +https://huggingface.co/osmanh/en-lt-t5-small \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-english_lithuanian_t5_small_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-english_lithuanian_t5_small_pipeline_en.md new file mode 100644 index 00000000000000..c46c7eef1f69e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-english_lithuanian_t5_small_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English english_lithuanian_t5_small_pipeline pipeline T5Transformer from osmanh +author: John Snow Labs +name: english_lithuanian_t5_small_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`english_lithuanian_t5_small_pipeline` is a English model originally trained by osmanh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/english_lithuanian_t5_small_pipeline_en_5.5.1_3.0_1734327355723.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/english_lithuanian_t5_small_pipeline_en_5.5.1_3.0_1734327355723.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("english_lithuanian_t5_small_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("english_lithuanian_t5_small_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|english_lithuanian_t5_small_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|343.1 MB| + +## References + +https://huggingface.co/osmanh/en-lt-t5-small + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-english_multinerd_masked_ner_en.md b/docs/_posts/ahmedlone127/2024-12-16-english_multinerd_masked_ner_en.md new file mode 100644 index 00000000000000..ff080d1c02f7ad --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-english_multinerd_masked_ner_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English english_multinerd_masked_ner DistilBertForTokenClassification from pariakashani +author: John Snow Labs +name: english_multinerd_masked_ner +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`english_multinerd_masked_ner` is a English model originally trained by pariakashani. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/english_multinerd_masked_ner_en_5.5.1_3.0_1734310044617.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/english_multinerd_masked_ner_en_5.5.1_3.0_1734310044617.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = DistilBertForTokenClassification.pretrained("english_multinerd_masked_ner","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = DistilBertForTokenClassification.pretrained("english_multinerd_masked_ner", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|english_multinerd_masked_ner| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/pariakashani/en-multinerd-masked-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-english_multinerd_masked_ner_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-english_multinerd_masked_ner_pipeline_en.md new file mode 100644 index 00000000000000..30ff11e1f30408 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-english_multinerd_masked_ner_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English english_multinerd_masked_ner_pipeline pipeline DistilBertForTokenClassification from pariakashani +author: John Snow Labs +name: english_multinerd_masked_ner_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`english_multinerd_masked_ner_pipeline` is a English model originally trained by pariakashani. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/english_multinerd_masked_ner_pipeline_en_5.5.1_3.0_1734310064376.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/english_multinerd_masked_ner_pipeline_en_5.5.1_3.0_1734310064376.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("english_multinerd_masked_ner_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("english_multinerd_masked_ner_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|english_multinerd_masked_ner_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/pariakashani/en-multinerd-masked-ner + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-english_multinerd_ner_en.md b/docs/_posts/ahmedlone127/2024-12-16-english_multinerd_ner_en.md new file mode 100644 index 00000000000000..20ca242917f300 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-english_multinerd_ner_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English english_multinerd_ner DistilBertForTokenClassification from pariakashani +author: John Snow Labs +name: english_multinerd_ner +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`english_multinerd_ner` is a English model originally trained by pariakashani. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/english_multinerd_ner_en_5.5.1_3.0_1734310630728.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/english_multinerd_ner_en_5.5.1_3.0_1734310630728.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = DistilBertForTokenClassification.pretrained("english_multinerd_ner","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = DistilBertForTokenClassification.pretrained("english_multinerd_ner", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|english_multinerd_ner| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/pariakashani/en-multinerd-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-english_multinerd_ner_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-english_multinerd_ner_pipeline_en.md new file mode 100644 index 00000000000000..5f6bf8188a8bdc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-english_multinerd_ner_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English english_multinerd_ner_pipeline pipeline DistilBertForTokenClassification from pariakashani +author: John Snow Labs +name: english_multinerd_ner_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`english_multinerd_ner_pipeline` is a English model originally trained by pariakashani. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/english_multinerd_ner_pipeline_en_5.5.1_3.0_1734310645203.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/english_multinerd_ner_pipeline_en_5.5.1_3.0_1734310645203.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("english_multinerd_ner_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("english_multinerd_ner_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|english_multinerd_ner_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.4 MB| + +## References + +https://huggingface.co/pariakashani/en-multinerd-ner + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-ethix4ai_en.md b/docs/_posts/ahmedlone127/2024-12-16-ethix4ai_en.md new file mode 100644 index 00000000000000..9d6294b2b66a32 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-ethix4ai_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English ethix4ai DistilBertForTokenClassification from Somisetty2347 +author: John Snow Labs +name: ethix4ai +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ethix4ai` is a English model originally trained by Somisetty2347. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ethix4ai_en_5.5.1_3.0_1734310367562.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ethix4ai_en_5.5.1_3.0_1734310367562.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = DistilBertForTokenClassification.pretrained("ethix4ai","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = DistilBertForTokenClassification.pretrained("ethix4ai", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ethix4ai| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|244.0 MB| + +## References + +https://huggingface.co/Somisetty2347/Ethix4ai \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-ethix4ai_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-ethix4ai_pipeline_en.md new file mode 100644 index 00000000000000..39fdcf9aeb1413 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-ethix4ai_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English ethix4ai_pipeline pipeline DistilBertForTokenClassification from Somisetty2347 +author: John Snow Labs +name: ethix4ai_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ethix4ai_pipeline` is a English model originally trained by Somisetty2347. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ethix4ai_pipeline_en_5.5.1_3.0_1734310379991.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ethix4ai_pipeline_en_5.5.1_3.0_1734310379991.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("ethix4ai_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("ethix4ai_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ethix4ai_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|244.0 MB| + +## References + +https://huggingface.co/Somisetty2347/Ethix4ai + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_french_hubert_s767_fr.md b/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_french_hubert_s767_fr.md new file mode 100644 index 00000000000000..6fb138004150e8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_french_hubert_s767_fr.md @@ -0,0 +1,84 @@ +--- +layout: model +title: French exp_w2v2t_french_hubert_s767 HubertForCTC from jonatasgrosman +author: John Snow Labs +name: exp_w2v2t_french_hubert_s767 +date: 2024-12-16 +tags: [fr, open_source, onnx, asr, hubert] +task: Automatic Speech Recognition +language: fr +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: HubertForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained HubertForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`exp_w2v2t_french_hubert_s767` is a French model originally trained by jonatasgrosman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/exp_w2v2t_french_hubert_s767_fr_5.5.1_3.0_1734307983320.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/exp_w2v2t_french_hubert_s767_fr_5.5.1_3.0_1734307983320.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

# data is assumed to be a DataFrame with an "audio_content" column
# holding the raw audio samples as an array of floats
audioAssembler = AudioAssembler() \
    .setInputCol("audio_content") \
    .setOutputCol("audio_assembler")

speechToText = HubertForCTC.pretrained("exp_w2v2t_french_hubert_s767","fr") \
    .setInputCols(["audio_assembler"]) \
    .setOutputCol("text")

pipeline = Pipeline().setStages([audioAssembler, speechToText])
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

// data is assumed to be a DataFrame with an "audio_content" column
// holding the raw audio samples as an array of floats
val audioAssembler = new AudioAssembler()
    .setInputCol("audio_content")
    .setOutputCol("audio_assembler")

val speechToText = HubertForCTC.pretrained("exp_w2v2t_french_hubert_s767", "fr")
    .setInputCols(Array("audio_assembler"))
    .setOutputCol("text")

val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText))
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
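
A hedged sketch of how the `data` DataFrame assumed above might be built, reusing the same SparkSession and substituting a synthetic one-second tone for a real 16 kHz recording:

```python
import math
from pyspark.sql.types import ArrayType, FloatType, StructField, StructType

# Synthetic 440 Hz tone sampled at 16 kHz, standing in for real speech audio
raw_floats = [float(math.sin(2 * math.pi * 440 * t / 16000)) for t in range(16000)]

schema = StructType([StructField("audio_content", ArrayType(FloatType()))])
data = spark.createDataFrame([(raw_floats,)], schema)
```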
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|exp_w2v2t_french_hubert_s767| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|fr| +|Size:|2.4 GB| + +## References + +https://huggingface.co/jonatasgrosman/exp_w2v2t_fr_hubert_s767 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_french_hubert_s767_pipeline_fr.md b/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_french_hubert_s767_pipeline_fr.md new file mode 100644 index 00000000000000..abbd307af95c11 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_french_hubert_s767_pipeline_fr.md @@ -0,0 +1,69 @@ +--- +layout: model +title: French exp_w2v2t_french_hubert_s767_pipeline pipeline HubertForCTC from jonatasgrosman +author: John Snow Labs +name: exp_w2v2t_french_hubert_s767_pipeline +date: 2024-12-16 +tags: [fr, open_source, pipeline, onnx] +task: Automatic Speech Recognition +language: fr +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained HubertForCTC, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`exp_w2v2t_french_hubert_s767_pipeline` is a French model originally trained by jonatasgrosman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/exp_w2v2t_french_hubert_s767_pipeline_fr_5.5.1_3.0_1734308096795.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/exp_w2v2t_french_hubert_s767_pipeline_fr_5.5.1_3.0_1734308096795.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("exp_w2v2t_french_hubert_s767_pipeline", lang = "fr") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("exp_w2v2t_french_hubert_s767_pipeline", lang = "fr") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|exp_w2v2t_french_hubert_s767_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|fr| +|Size:|2.4 GB| + +## References + +https://huggingface.co/jonatasgrosman/exp_w2v2t_fr_hubert_s767 + +## Included Models + +- AudioAssembler +- HubertForCTC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_german_hubert_s126_de.md b/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_german_hubert_s126_de.md new file mode 100644 index 00000000000000..890fccd0741138 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_german_hubert_s126_de.md @@ -0,0 +1,84 @@ +--- +layout: model +title: German exp_w2v2t_german_hubert_s126 HubertForCTC from jonatasgrosman +author: John Snow Labs +name: exp_w2v2t_german_hubert_s126 +date: 2024-12-16 +tags: [de, open_source, onnx, asr, hubert] +task: Automatic Speech Recognition +language: de +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: HubertForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained HubertForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`exp_w2v2t_german_hubert_s126` is a German model originally trained by jonatasgrosman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/exp_w2v2t_german_hubert_s126_de_5.5.1_3.0_1734308149135.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/exp_w2v2t_german_hubert_s126_de_5.5.1_3.0_1734308149135.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

# data is assumed to be a DataFrame with an "audio_content" column
# holding the raw audio samples as an array of floats
audioAssembler = AudioAssembler() \
    .setInputCol("audio_content") \
    .setOutputCol("audio_assembler")

speechToText = HubertForCTC.pretrained("exp_w2v2t_german_hubert_s126","de") \
    .setInputCols(["audio_assembler"]) \
    .setOutputCol("text")

pipeline = Pipeline().setStages([audioAssembler, speechToText])
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

// data is assumed to be a DataFrame with an "audio_content" column
// holding the raw audio samples as an array of floats
val audioAssembler = new AudioAssembler()
    .setInputCol("audio_content")
    .setOutputCol("audio_assembler")

val speechToText = HubertForCTC.pretrained("exp_w2v2t_german_hubert_s126", "de")
    .setInputCols(Array("audio_assembler"))
    .setOutputCol("text")

val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText))
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|exp_w2v2t_german_hubert_s126| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|de| +|Size:|2.4 GB| + +## References + +https://huggingface.co/jonatasgrosman/exp_w2v2t_de_hubert_s126 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_german_hubert_s126_pipeline_de.md b/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_german_hubert_s126_pipeline_de.md new file mode 100644 index 00000000000000..970c6dfe33e02f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_german_hubert_s126_pipeline_de.md @@ -0,0 +1,69 @@ +--- +layout: model +title: German exp_w2v2t_german_hubert_s126_pipeline pipeline HubertForCTC from jonatasgrosman +author: John Snow Labs +name: exp_w2v2t_german_hubert_s126_pipeline +date: 2024-12-16 +tags: [de, open_source, pipeline, onnx] +task: Automatic Speech Recognition +language: de +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained HubertForCTC, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`exp_w2v2t_german_hubert_s126_pipeline` is a German model originally trained by jonatasgrosman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/exp_w2v2t_german_hubert_s126_pipeline_de_5.5.1_3.0_1734308270073.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/exp_w2v2t_german_hubert_s126_pipeline_de_5.5.1_3.0_1734308270073.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("exp_w2v2t_german_hubert_s126_pipeline", lang = "de") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("exp_w2v2t_german_hubert_s126_pipeline", lang = "de") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|exp_w2v2t_german_hubert_s126_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|de| +|Size:|2.4 GB| + +## References + +https://huggingface.co/jonatasgrosman/exp_w2v2t_de_hubert_s126 + +## Included Models + +- AudioAssembler +- HubertForCTC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_german_hubert_s55_de.md b/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_german_hubert_s55_de.md new file mode 100644 index 00000000000000..3b0ffc325f7dfd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_german_hubert_s55_de.md @@ -0,0 +1,84 @@ +--- +layout: model +title: German exp_w2v2t_german_hubert_s55 HubertForCTC from jonatasgrosman +author: John Snow Labs +name: exp_w2v2t_german_hubert_s55 +date: 2024-12-16 +tags: [de, open_source, onnx, asr, hubert] +task: Automatic Speech Recognition +language: de +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: HubertForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained HubertForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`exp_w2v2t_german_hubert_s55` is a German model originally trained by jonatasgrosman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/exp_w2v2t_german_hubert_s55_de_5.5.1_3.0_1734308621089.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/exp_w2v2t_german_hubert_s55_de_5.5.1_3.0_1734308621089.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

# data is assumed to be a DataFrame with an "audio_content" column
# holding the raw audio samples as an array of floats
audioAssembler = AudioAssembler() \
    .setInputCol("audio_content") \
    .setOutputCol("audio_assembler")

speechToText = HubertForCTC.pretrained("exp_w2v2t_german_hubert_s55","de") \
    .setInputCols(["audio_assembler"]) \
    .setOutputCol("text")

pipeline = Pipeline().setStages([audioAssembler, speechToText])
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

// data is assumed to be a DataFrame with an "audio_content" column
// holding the raw audio samples as an array of floats
val audioAssembler = new AudioAssembler()
    .setInputCol("audio_content")
    .setOutputCol("audio_assembler")

val speechToText = HubertForCTC.pretrained("exp_w2v2t_german_hubert_s55", "de")
    .setInputCols(Array("audio_assembler"))
    .setOutputCol("text")

val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText))
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|exp_w2v2t_german_hubert_s55| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|de| +|Size:|2.4 GB| + +## References + +https://huggingface.co/jonatasgrosman/exp_w2v2t_de_hubert_s55 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_german_hubert_s55_pipeline_de.md b/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_german_hubert_s55_pipeline_de.md new file mode 100644 index 00000000000000..c4332b8a8565de --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_german_hubert_s55_pipeline_de.md @@ -0,0 +1,69 @@ +--- +layout: model +title: German exp_w2v2t_german_hubert_s55_pipeline pipeline HubertForCTC from jonatasgrosman +author: John Snow Labs +name: exp_w2v2t_german_hubert_s55_pipeline +date: 2024-12-16 +tags: [de, open_source, pipeline, onnx] +task: Automatic Speech Recognition +language: de +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained HubertForCTC, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`exp_w2v2t_german_hubert_s55_pipeline` is a German model originally trained by jonatasgrosman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/exp_w2v2t_german_hubert_s55_pipeline_de_5.5.1_3.0_1734308740343.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/exp_w2v2t_german_hubert_s55_pipeline_de_5.5.1_3.0_1734308740343.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("exp_w2v2t_german_hubert_s55_pipeline", lang = "de") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("exp_w2v2t_german_hubert_s55_pipeline", lang = "de") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|exp_w2v2t_german_hubert_s55_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|de| +|Size:|2.4 GB| + +## References + +https://huggingface.co/jonatasgrosman/exp_w2v2t_de_hubert_s55 + +## Included Models + +- AudioAssembler +- HubertForCTC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_spanish_hubert_s456_es.md b/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_spanish_hubert_s456_es.md new file mode 100644 index 00000000000000..e1ddd0c8eb718e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_spanish_hubert_s456_es.md @@ -0,0 +1,84 @@ +--- +layout: model +title: Castilian, Spanish exp_w2v2t_spanish_hubert_s456 HubertForCTC from jonatasgrosman +author: John Snow Labs +name: exp_w2v2t_spanish_hubert_s456 +date: 2024-12-16 +tags: [es, open_source, onnx, asr, hubert] +task: Automatic Speech Recognition +language: es +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: HubertForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained HubertForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`exp_w2v2t_spanish_hubert_s456` is a Castilian, Spanish model originally trained by jonatasgrosman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/exp_w2v2t_spanish_hubert_s456_es_5.5.1_3.0_1734308731554.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/exp_w2v2t_spanish_hubert_s456_es_5.5.1_3.0_1734308731554.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

# data is assumed to be a DataFrame with an "audio_content" column
# holding the raw audio samples as an array of floats
audioAssembler = AudioAssembler() \
    .setInputCol("audio_content") \
    .setOutputCol("audio_assembler")

speechToText = HubertForCTC.pretrained("exp_w2v2t_spanish_hubert_s456","es") \
    .setInputCols(["audio_assembler"]) \
    .setOutputCol("text")

pipeline = Pipeline().setStages([audioAssembler, speechToText])
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

// data is assumed to be a DataFrame with an "audio_content" column
// holding the raw audio samples as an array of floats
val audioAssembler = new AudioAssembler()
    .setInputCol("audio_content")
    .setOutputCol("audio_assembler")

val speechToText = HubertForCTC.pretrained("exp_w2v2t_spanish_hubert_s456", "es")
    .setInputCols(Array("audio_assembler"))
    .setOutputCol("text")

val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText))
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|exp_w2v2t_spanish_hubert_s456| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|es| +|Size:|2.4 GB| + +## References + +https://huggingface.co/jonatasgrosman/exp_w2v2t_es_hubert_s456 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_spanish_hubert_s456_pipeline_es.md b/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_spanish_hubert_s456_pipeline_es.md new file mode 100644 index 00000000000000..1027c9f565089e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_spanish_hubert_s456_pipeline_es.md @@ -0,0 +1,69 @@ +--- +layout: model +title: Castilian, Spanish exp_w2v2t_spanish_hubert_s456_pipeline pipeline HubertForCTC from jonatasgrosman +author: John Snow Labs +name: exp_w2v2t_spanish_hubert_s456_pipeline +date: 2024-12-16 +tags: [es, open_source, pipeline, onnx] +task: Automatic Speech Recognition +language: es +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained HubertForCTC, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`exp_w2v2t_spanish_hubert_s456_pipeline` is a Castilian, Spanish model originally trained by jonatasgrosman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/exp_w2v2t_spanish_hubert_s456_pipeline_es_5.5.1_3.0_1734308860901.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/exp_w2v2t_spanish_hubert_s456_pipeline_es_5.5.1_3.0_1734308860901.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("exp_w2v2t_spanish_hubert_s456_pipeline", lang = "es") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("exp_w2v2t_spanish_hubert_s456_pipeline", lang = "es") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|exp_w2v2t_spanish_hubert_s456_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|es| +|Size:|2.4 GB| + +## References + +https://huggingface.co/jonatasgrosman/exp_w2v2t_es_hubert_s456 + +## Included Models + +- AudioAssembler +- HubertForCTC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_spanish_hubert_s459_es.md b/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_spanish_hubert_s459_es.md new file mode 100644 index 00000000000000..389553e088db74 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_spanish_hubert_s459_es.md @@ -0,0 +1,84 @@ +--- +layout: model +title: Castilian, Spanish exp_w2v2t_spanish_hubert_s459 HubertForCTC from jonatasgrosman +author: John Snow Labs +name: exp_w2v2t_spanish_hubert_s459 +date: 2024-12-16 +tags: [es, open_source, onnx, asr, hubert] +task: Automatic Speech Recognition +language: es +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: HubertForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained HubertForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`exp_w2v2t_spanish_hubert_s459` is a Castilian, Spanish model originally trained by jonatasgrosman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/exp_w2v2t_spanish_hubert_s459_es_5.5.1_3.0_1734308013695.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/exp_w2v2t_spanish_hubert_s459_es_5.5.1_3.0_1734308013695.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

# data is assumed to be a DataFrame with an "audio_content" column
# holding the raw audio samples as an array of floats
audioAssembler = AudioAssembler() \
    .setInputCol("audio_content") \
    .setOutputCol("audio_assembler")

speechToText = HubertForCTC.pretrained("exp_w2v2t_spanish_hubert_s459","es") \
    .setInputCols(["audio_assembler"]) \
    .setOutputCol("text")

pipeline = Pipeline().setStages([audioAssembler, speechToText])
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

// data is assumed to be a DataFrame with an "audio_content" column
// holding the raw audio samples as an array of floats
val audioAssembler = new AudioAssembler()
    .setInputCol("audio_content")
    .setOutputCol("audio_assembler")

val speechToText = HubertForCTC.pretrained("exp_w2v2t_spanish_hubert_s459", "es")
    .setInputCols(Array("audio_assembler"))
    .setOutputCol("text")

val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText))
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|exp_w2v2t_spanish_hubert_s459| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|es| +|Size:|2.4 GB| + +## References + +https://huggingface.co/jonatasgrosman/exp_w2v2t_es_hubert_s459 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_spanish_hubert_s459_pipeline_es.md b/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_spanish_hubert_s459_pipeline_es.md new file mode 100644 index 00000000000000..efd2586ef86bda --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-exp_w2v2t_spanish_hubert_s459_pipeline_es.md @@ -0,0 +1,69 @@ +--- +layout: model +title: Castilian, Spanish exp_w2v2t_spanish_hubert_s459_pipeline pipeline HubertForCTC from jonatasgrosman +author: John Snow Labs +name: exp_w2v2t_spanish_hubert_s459_pipeline +date: 2024-12-16 +tags: [es, open_source, pipeline, onnx] +task: Automatic Speech Recognition +language: es +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained HubertForCTC, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`exp_w2v2t_spanish_hubert_s459_pipeline` is a Castilian, Spanish model originally trained by jonatasgrosman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/exp_w2v2t_spanish_hubert_s459_pipeline_es_5.5.1_3.0_1734308126376.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/exp_w2v2t_spanish_hubert_s459_pipeline_es_5.5.1_3.0_1734308126376.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("exp_w2v2t_spanish_hubert_s459_pipeline", lang = "es") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("exp_w2v2t_spanish_hubert_s459_pipeline", lang = "es") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|exp_w2v2t_spanish_hubert_s459_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|es| +|Size:|2.4 GB| + +## References + +https://huggingface.co/jonatasgrosman/exp_w2v2t_es_hubert_s459 + +## Included Models + +- AudioAssembler +- HubertForCTC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-feature_extraction_model_damage_trigger_effect_location_naacl_2025_en.md b/docs/_posts/ahmedlone127/2024-12-16-feature_extraction_model_damage_trigger_effect_location_naacl_2025_en.md new file mode 100644 index 00000000000000..111ca03818ec9a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-feature_extraction_model_damage_trigger_effect_location_naacl_2025_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English feature_extraction_model_damage_trigger_effect_location_naacl_2025 BertForTokenClassification from Lolimorimorf +author: John Snow Labs +name: feature_extraction_model_damage_trigger_effect_location_naacl_2025 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, bert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`feature_extraction_model_damage_trigger_effect_location_naacl_2025` is a English model originally trained by Lolimorimorf. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/feature_extraction_model_damage_trigger_effect_location_naacl_2025_en_5.5.1_3.0_1734337615361.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/feature_extraction_model_damage_trigger_effect_location_naacl_2025_en_5.5.1_3.0_1734337615361.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = BertForTokenClassification.pretrained("feature_extraction_model_damage_trigger_effect_location_naacl_2025","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = BertForTokenClassification.pretrained("feature_extraction_model_damage_trigger_effect_location_naacl_2025", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|feature_extraction_model_damage_trigger_effect_location_naacl_2025| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|664.3 MB| + +## References + +https://huggingface.co/Lolimorimorf/feature_extraction_model_damage_trigger_effect_location_naacl_2025 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-finance_chatbot_flan_t5_base_en.md b/docs/_posts/ahmedlone127/2024-12-16-finance_chatbot_flan_t5_base_en.md new file mode 100644 index 00000000000000..273e734caac5b8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-finance_chatbot_flan_t5_base_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English finance_chatbot_flan_t5_base T5Transformer from ai1-test +author: John Snow Labs +name: finance_chatbot_flan_t5_base +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`finance_chatbot_flan_t5_base` is a English model originally trained by ai1-test. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finance_chatbot_flan_t5_base_en_5.5.1_3.0_1734330607849.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finance_chatbot_flan_t5_base_en_5.5.1_3.0_1734330607849.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +t5 = T5Transformer.pretrained("finance_chatbot_flan_t5_base","en") \ + .setInputCols(["document"]) \ + .setOutputCol("output") + +pipeline = Pipeline().setStages([documentAssembler, t5]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCols("text") + .setOutputCols("document") + +val t5 = T5Transformer.pretrained("finance_chatbot_flan_t5_base", "en") + .setInputCols(Array("documents")) + .setOutputCol("output") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, t5)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
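
The generated reply lands in the `output` annotation column of `pipelineDF` from the example above. The short sketch below shows how to read it and, optionally, cap the generation length; it assumes the same session and imports, and `setMaxOutputLength` is a standard `T5Transformer` parameter.

```python
from sparknlp.annotator import T5Transformer

# Read the generated text produced by the pipeline above.
pipelineDF.select("output.result").show(truncate=False)

# Optionally cap the reply length before fitting the pipeline.
t5 = T5Transformer.pretrained("finance_chatbot_flan_t5_base", "en") \
    .setInputCols(["document"]) \
    .setOutputCol("output") \
    .setMaxOutputLength(128)
```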
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finance_chatbot_flan_t5_base| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/ai1-test/finance-chatbot-flan-t5-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-finance_chatbot_flan_t5_base_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-finance_chatbot_flan_t5_base_pipeline_en.md new file mode 100644 index 00000000000000..71df30da184743 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-finance_chatbot_flan_t5_base_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English finance_chatbot_flan_t5_base_pipeline pipeline T5Transformer from ai1-test +author: John Snow Labs +name: finance_chatbot_flan_t5_base_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`finance_chatbot_flan_t5_base_pipeline` is a English model originally trained by ai1-test. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finance_chatbot_flan_t5_base_pipeline_en_5.5.1_3.0_1734330664765.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finance_chatbot_flan_t5_base_pipeline_en_5.5.1_3.0_1734330664765.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finance_chatbot_flan_t5_base_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finance_chatbot_flan_t5_base_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
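
The snippet above assumes a running Spark session and a DataFrame `df` with a `text` column. A minimal, illustrative setup (the sample sentence is made up):

```python
import sparknlp
from sparknlp.pretrained import PretrainedPipeline

spark = sparknlp.start()
df = spark.createDataFrame([["How do I open a savings account?"]]).toDF("text")

pipeline = PretrainedPipeline("finance_chatbot_flan_t5_base_pipeline", lang="en")
annotations = pipeline.transform(df)

# For quick checks on a single string, the same pipeline can be called directly:
print(pipeline.annotate("How do I open a savings account?"))
```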
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finance_chatbot_flan_t5_base_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/ai1-test/finance-chatbot-flan-t5-base + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-fine_tuned_t5_small_model_sec_5_v11_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-fine_tuned_t5_small_model_sec_5_v11_pipeline_en.md new file mode 100644 index 00000000000000..b4ed40c5183419 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-fine_tuned_t5_small_model_sec_5_v11_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English fine_tuned_t5_small_model_sec_5_v11_pipeline pipeline T5Transformer from miasetya +author: John Snow Labs +name: fine_tuned_t5_small_model_sec_5_v11_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`fine_tuned_t5_small_model_sec_5_v11_pipeline` is a English model originally trained by miasetya. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_t5_small_model_sec_5_v11_pipeline_en_5.5.1_3.0_1734328206036.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_t5_small_model_sec_5_v11_pipeline_en_5.5.1_3.0_1734328206036.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("fine_tuned_t5_small_model_sec_5_v11_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("fine_tuned_t5_small_model_sec_5_v11_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
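
As above, `df` must expose a `text` column. One way to see exactly which annotation columns the bundled stages add is to inspect the transformed schema; the sample frame below is illustrative.

```python
df = spark.createDataFrame([["Summarize section 5 of the filing."]]).toDF("text")
annotations = pipeline.transform(df)

# The bundled DocumentAssembler and T5Transformer stages add nested annotation columns
# on top of "text"; printSchema shows their structure and names.
annotations.printSchema()
```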
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_t5_small_model_sec_5_v11_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|319.8 MB| + +## References + +https://huggingface.co/miasetya/fine_tuned_t5_small_model_sec_5_v11 + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-fine_tuned_t5_small_model_sec_5_v3_en.md b/docs/_posts/ahmedlone127/2024-12-16-fine_tuned_t5_small_model_sec_5_v3_en.md new file mode 100644 index 00000000000000..ee22521a5cc713 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-fine_tuned_t5_small_model_sec_5_v3_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English fine_tuned_t5_small_model_sec_5_v3 T5Transformer from miasetya +author: John Snow Labs +name: fine_tuned_t5_small_model_sec_5_v3 +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`fine_tuned_t5_small_model_sec_5_v3` is a English model originally trained by miasetya. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_t5_small_model_sec_5_v3_en_5.5.1_3.0_1734328039154.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_t5_small_model_sec_5_v3_en_5.5.1_3.0_1734328039154.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +t5 = T5Transformer.pretrained("fine_tuned_t5_small_model_sec_5_v3","en") \ + .setInputCols(["document"]) \ + .setOutputCol("output") + +pipeline = Pipeline().setStages([documentAssembler, t5]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCols("text") + .setOutputCols("document") + +val t5 = T5Transformer.pretrained("fine_tuned_t5_small_model_sec_5_v3", "en") + .setInputCols(Array("documents")) + .setOutputCol("output") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, t5)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
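
For low-latency, single-document use, the fitted `pipelineModel` from the example above can be wrapped in a `LightPipeline`, which skips the DataFrame round-trip. A brief sketch, assuming the output column is `output` as in that snippet:

```python
from sparknlp.base import LightPipeline

light = LightPipeline(pipelineModel)
result = light.annotate("I love spark-nlp")
print(result["output"])  # list holding the generated text
```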
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_t5_small_model_sec_5_v3| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|318.0 MB| + +## References + +https://huggingface.co/miasetya/fine_tuned_t5_small_model_sec_5_v3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-fine_tuned_t5_small_model_sec_5_v3_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-fine_tuned_t5_small_model_sec_5_v3_pipeline_en.md new file mode 100644 index 00000000000000..801690b8c7a10e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-fine_tuned_t5_small_model_sec_5_v3_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English fine_tuned_t5_small_model_sec_5_v3_pipeline pipeline T5Transformer from miasetya +author: John Snow Labs +name: fine_tuned_t5_small_model_sec_5_v3_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`fine_tuned_t5_small_model_sec_5_v3_pipeline` is a English model originally trained by miasetya. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_t5_small_model_sec_5_v3_pipeline_en_5.5.1_3.0_1734328061806.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_t5_small_model_sec_5_v3_pipeline_en_5.5.1_3.0_1734328061806.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("fine_tuned_t5_small_model_sec_5_v3_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("fine_tuned_t5_small_model_sec_5_v3_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_t5_small_model_sec_5_v3_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|318.0 MB| + +## References + +https://huggingface.co/miasetya/fine_tuned_t5_small_model_sec_5_v3 + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-fine_tuned_t5_small_model_sec_5_v6_en.md b/docs/_posts/ahmedlone127/2024-12-16-fine_tuned_t5_small_model_sec_5_v6_en.md new file mode 100644 index 00000000000000..4c3d62da7ae5fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-fine_tuned_t5_small_model_sec_5_v6_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English fine_tuned_t5_small_model_sec_5_v6 T5Transformer from miasetya +author: John Snow Labs +name: fine_tuned_t5_small_model_sec_5_v6 +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`fine_tuned_t5_small_model_sec_5_v6` is a English model originally trained by miasetya. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_t5_small_model_sec_5_v6_en_5.5.1_3.0_1734331128583.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_t5_small_model_sec_5_v6_en_5.5.1_3.0_1734331128583.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +t5 = T5Transformer.pretrained("fine_tuned_t5_small_model_sec_5_v6","en") \ + .setInputCols(["document"]) \ + .setOutputCol("output") + +pipeline = Pipeline().setStages([documentAssembler, t5]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCols("text") + .setOutputCols("document") + +val t5 = T5Transformer.pretrained("fine_tuned_t5_small_model_sec_5_v6", "en") + .setInputCols(Array("documents")) + .setOutputCol("output") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, t5)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
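
To avoid re-downloading the model on every run, the fitted pipeline from the example above can be persisted with the standard Spark ML writer and reloaded later; the path below is illustrative.

```python
from pyspark.ml import PipelineModel

pipelineModel.write().overwrite().save("/tmp/fine_tuned_t5_small_model_sec_5_v6_pipeline")

restored = PipelineModel.load("/tmp/fine_tuned_t5_small_model_sec_5_v6_pipeline")
restored.transform(data).select("output.result").show(truncate=False)
```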
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_t5_small_model_sec_5_v6| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|316.0 MB| + +## References + +https://huggingface.co/miasetya/fine_tuned_t5_small_model_sec_5_v6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-fine_tuned_t5_small_model_sec_5_v6_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-fine_tuned_t5_small_model_sec_5_v6_pipeline_en.md new file mode 100644 index 00000000000000..107ae3e43189df --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-fine_tuned_t5_small_model_sec_5_v6_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English fine_tuned_t5_small_model_sec_5_v6_pipeline pipeline T5Transformer from miasetya +author: John Snow Labs +name: fine_tuned_t5_small_model_sec_5_v6_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`fine_tuned_t5_small_model_sec_5_v6_pipeline` is a English model originally trained by miasetya. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_t5_small_model_sec_5_v6_pipeline_en_5.5.1_3.0_1734331151438.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_t5_small_model_sec_5_v6_pipeline_en_5.5.1_3.0_1734331151438.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("fine_tuned_t5_small_model_sec_5_v6_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("fine_tuned_t5_small_model_sec_5_v6_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_t5_small_model_sec_5_v6_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|316.0 MB| + +## References + +https://huggingface.co/miasetya/fine_tuned_t5_small_model_sec_5_v6 + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-finetuned_answer_trans_mixed_mechanical_data_french_2_en.md b/docs/_posts/ahmedlone127/2024-12-16-finetuned_answer_trans_mixed_mechanical_data_french_2_en.md new file mode 100644 index 00000000000000..fc77821c5726e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-finetuned_answer_trans_mixed_mechanical_data_french_2_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English finetuned_answer_trans_mixed_mechanical_data_french_2 T5Transformer from amiraMamdouh +author: John Snow Labs +name: finetuned_answer_trans_mixed_mechanical_data_french_2 +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`finetuned_answer_trans_mixed_mechanical_data_french_2` is a English model originally trained by amiraMamdouh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_answer_trans_mixed_mechanical_data_french_2_en_5.5.1_3.0_1734327484867.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_answer_trans_mixed_mechanical_data_french_2_en_5.5.1_3.0_1734327484867.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +t5 = T5Transformer.pretrained("finetuned_answer_trans_mixed_mechanical_data_french_2","en") \ + .setInputCols(["document"]) \ + .setOutputCol("output") + +pipeline = Pipeline().setStages([documentAssembler, t5]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCols("text") + .setOutputCols("document") + +val t5 = T5Transformer.pretrained("finetuned_answer_trans_mixed_mechanical_data_french_2", "en") + .setInputCols(Array("documents")) + .setOutputCol("output") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, t5)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
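
The same fitted pipeline scales to batch inference. A sketch reading one input per line from a plain-text file (the file name is illustrative):

```python
batch = (
    spark.read.text("questions.txt")          # yields a single "value" column
         .withColumnRenamed("value", "text")  # the pipeline expects a "text" column
)

pipelineModel.transform(batch).select("text", "output.result").show(truncate=False)
```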
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_answer_trans_mixed_mechanical_data_french_2| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|347.7 MB| + +## References + +https://huggingface.co/amiraMamdouh/finetuned_answer_trans_mixed_mechanical_data_fr_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-finetuned_answer_trans_mixed_mechanical_data_french_2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-finetuned_answer_trans_mixed_mechanical_data_french_2_pipeline_en.md new file mode 100644 index 00000000000000..54ef35f0af8a2c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-finetuned_answer_trans_mixed_mechanical_data_french_2_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English finetuned_answer_trans_mixed_mechanical_data_french_2_pipeline pipeline T5Transformer from amiraMamdouh +author: John Snow Labs +name: finetuned_answer_trans_mixed_mechanical_data_french_2_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`finetuned_answer_trans_mixed_mechanical_data_french_2_pipeline` is a English model originally trained by amiraMamdouh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_answer_trans_mixed_mechanical_data_french_2_pipeline_en_5.5.1_3.0_1734327502780.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_answer_trans_mixed_mechanical_data_french_2_pipeline_en_5.5.1_3.0_1734327502780.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("finetuned_answer_trans_mixed_mechanical_data_french_2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("finetuned_answer_trans_mixed_mechanical_data_french_2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_answer_trans_mixed_mechanical_data_french_2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|347.7 MB| + +## References + +https://huggingface.co/amiraMamdouh/finetuned_answer_trans_mixed_mechanical_data_fr_2 + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-flan_t5_hh_dpo_en.md b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_hh_dpo_en.md new file mode 100644 index 00000000000000..2feb4c15499266 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_hh_dpo_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English flan_t5_hh_dpo T5Transformer from Jise +author: John Snow Labs +name: flan_t5_hh_dpo +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_hh_dpo` is a English model originally trained by Jise. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_hh_dpo_en_5.5.1_3.0_1734329007333.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_hh_dpo_en_5.5.1_3.0_1734329007333.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +t5 = T5Transformer.pretrained("flan_t5_hh_dpo","en") \ + .setInputCols(["document"]) \ + .setOutputCol("output") + +pipeline = Pipeline().setStages([documentAssembler, t5]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCols("text") + .setOutputCols("document") + +val t5 = T5Transformer.pretrained("flan_t5_hh_dpo", "en") + .setInputCols(Array("documents")) + .setOutputCol("output") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, t5)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
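
To hand the generated text to plain Python code, collect the `output.result` arrays from `pipelineDF`; this is only sensible for small batches, since `collect()` pulls everything to the driver.

```python
rows = pipelineDF.select("output.result").collect()
generated = [row["result"][0] for row in rows if row["result"]]
print(generated)
```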
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_hh_dpo| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/Jise/flan-t5-hh-dpo \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-flan_t5_hh_dpo_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_hh_dpo_pipeline_en.md new file mode 100644 index 00000000000000..7fb378c3fdc0d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_hh_dpo_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English flan_t5_hh_dpo_pipeline pipeline T5Transformer from Jise +author: John Snow Labs +name: flan_t5_hh_dpo_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_hh_dpo_pipeline` is a English model originally trained by Jise. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_hh_dpo_pipeline_en_5.5.1_3.0_1734329059823.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_hh_dpo_pipeline_en_5.5.1_3.0_1734329059823.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("flan_t5_hh_dpo_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("flan_t5_hh_dpo_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_hh_dpo_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/Jise/flan-t5-hh-dpo + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-flan_t5_qa_study_assistant_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_qa_study_assistant_pipeline_en.md new file mode 100644 index 00000000000000..54216a17eb7f49 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_qa_study_assistant_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English flan_t5_qa_study_assistant_pipeline pipeline T5Transformer from tootooba +author: John Snow Labs +name: flan_t5_qa_study_assistant_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_qa_study_assistant_pipeline` is a English model originally trained by tootooba. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_qa_study_assistant_pipeline_en_5.5.1_3.0_1734332530830.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_qa_study_assistant_pipeline_en_5.5.1_3.0_1734332530830.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("flan_t5_qa_study_assistant_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("flan_t5_qa_study_assistant_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
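
Beyond `transform`, a `PretrainedPipeline` also exposes `fullAnnotate`, which returns the complete annotation objects (result text plus metadata) and helps when debugging what each stage produced. The question below and the `output` key (the bundled T5 stage's assumed output column) are illustrative assumptions.

```python
result = pipeline.fullAnnotate("Explain photosynthesis in one sentence.")[0]

for ann in result["output"]:
    print(ann.result, ann.metadata)
```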
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_qa_study_assistant_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/tootooba/flan-t5-qa-study-assistant + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_3b_en.md b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_3b_en.md new file mode 100644 index 00000000000000..19270562b92e73 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_3b_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English flan_t5_rouge_durga_3b T5Transformer from devagonal +author: John Snow Labs +name: flan_t5_rouge_durga_3b +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_rouge_durga_3b` is a English model originally trained by devagonal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_rouge_durga_3b_en_5.5.1_3.0_1734333095633.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_rouge_durga_3b_en_5.5.1_3.0_1734333095633.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +t5 = T5Transformer.pretrained("flan_t5_rouge_durga_3b","en") \ + .setInputCols(["document"]) \ + .setOutputCol("output") + +pipeline = Pipeline().setStages([documentAssembler, t5]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCols("text") + .setOutputCols("document") + +val t5 = T5Transformer.pretrained("flan_t5_rouge_durga_3b", "en") + .setInputCols(Array("documents")) + .setOutputCol("output") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, t5)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
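
If the archive behind the Download button has already been fetched and unpacked, the model can be loaded from disk instead of calling `pretrained()`, which is convenient on air-gapped clusters; the local path is illustrative.

```python
from sparknlp.annotator import T5Transformer

t5_local = T5Transformer.load("/models/flan_t5_rouge_durga_3b_en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")
```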
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_rouge_durga_3b| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/devagonal/flan-t5-rouge-durga-3b \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_3b_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_3b_pipeline_en.md new file mode 100644 index 00000000000000..40bc6d9533529d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_3b_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English flan_t5_rouge_durga_3b_pipeline pipeline T5Transformer from devagonal +author: John Snow Labs +name: flan_t5_rouge_durga_3b_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_rouge_durga_3b_pipeline` is a English model originally trained by devagonal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_rouge_durga_3b_pipeline_en_5.5.1_3.0_1734333145516.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_rouge_durga_3b_pipeline_en_5.5.1_3.0_1734333145516.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("flan_t5_rouge_durga_3b_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("flan_t5_rouge_durga_3b_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_rouge_durga_3b_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/devagonal/flan-t5-rouge-durga-3b + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_q5_clean_2_en.md b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_q5_clean_2_en.md new file mode 100644 index 00000000000000..ec42af10bcb8f7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_q5_clean_2_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English flan_t5_rouge_durga_q5_clean_2 T5Transformer from devagonal +author: John Snow Labs +name: flan_t5_rouge_durga_q5_clean_2 +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_rouge_durga_q5_clean_2` is a English model originally trained by devagonal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_rouge_durga_q5_clean_2_en_5.5.1_3.0_1734332153342.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_rouge_durga_q5_clean_2_en_5.5.1_3.0_1734332153342.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +t5 = T5Transformer.pretrained("flan_t5_rouge_durga_q5_clean_2","en") \ + .setInputCols(["document"]) \ + .setOutputCol("output") + +pipeline = Pipeline().setStages([documentAssembler, t5]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCols("text") + .setOutputCols("document") + +val t5 = T5Transformer.pretrained("flan_t5_rouge_durga_q5_clean_2", "en") + .setInputCols(Array("documents")) + .setOutputCol("output") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, t5)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_rouge_durga_q5_clean_2| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/devagonal/flan-t5-rouge-durga-q5-clean-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_q5_clean_2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_q5_clean_2_pipeline_en.md new file mode 100644 index 00000000000000..5e72cf3ef83343 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_q5_clean_2_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English flan_t5_rouge_durga_q5_clean_2_pipeline pipeline T5Transformer from devagonal +author: John Snow Labs +name: flan_t5_rouge_durga_q5_clean_2_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_rouge_durga_q5_clean_2_pipeline` is a English model originally trained by devagonal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_rouge_durga_q5_clean_2_pipeline_en_5.5.1_3.0_1734332203850.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_rouge_durga_q5_clean_2_pipeline_en_5.5.1_3.0_1734332203850.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("flan_t5_rouge_durga_q5_clean_2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("flan_t5_rouge_durga_q5_clean_2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_rouge_durga_q5_clean_2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/devagonal/flan-t5-rouge-durga-q5-clean-2 + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_q5_clean_4c_en.md b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_q5_clean_4c_en.md new file mode 100644 index 00000000000000..db08b4726029d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_q5_clean_4c_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English flan_t5_rouge_durga_q5_clean_4c T5Transformer from devagonal +author: John Snow Labs +name: flan_t5_rouge_durga_q5_clean_4c +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_rouge_durga_q5_clean_4c` is a English model originally trained by devagonal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_rouge_durga_q5_clean_4c_en_5.5.1_3.0_1734332640434.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_rouge_durga_q5_clean_4c_en_5.5.1_3.0_1734332640434.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +t5 = T5Transformer.pretrained("flan_t5_rouge_durga_q5_clean_4c","en") \ + .setInputCols(["document"]) \ + .setOutputCol("output") + +pipeline = Pipeline().setStages([documentAssembler, t5]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCols("text") + .setOutputCols("document") + +val t5 = T5Transformer.pretrained("flan_t5_rouge_durga_q5_clean_4c", "en") + .setInputCols(Array("documents")) + .setOutputCol("output") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, t5)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
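
This export uses the ONNX engine; on a GPU cluster the session can be started with GPU support before building the pipeline above. A one-line sketch, assuming the GPU build of Spark NLP and a CUDA-capable device are available:

```python
import sparknlp

spark = sparknlp.start(gpu=True)  # everything else in the example above stays the same
```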
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_rouge_durga_q5_clean_4c| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/devagonal/flan-t5-rouge-durga-q5-clean-4c \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_q5_clean_4c_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_q5_clean_4c_pipeline_en.md new file mode 100644 index 00000000000000..785546605d0f72 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_q5_clean_4c_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English flan_t5_rouge_durga_q5_clean_4c_pipeline pipeline T5Transformer from devagonal +author: John Snow Labs +name: flan_t5_rouge_durga_q5_clean_4c_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_rouge_durga_q5_clean_4c_pipeline` is a English model originally trained by devagonal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_rouge_durga_q5_clean_4c_pipeline_en_5.5.1_3.0_1734332695475.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_rouge_durga_q5_clean_4c_pipeline_en_5.5.1_3.0_1734332695475.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("flan_t5_rouge_durga_q5_clean_4c_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("flan_t5_rouge_durga_q5_clean_4c_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_rouge_durga_q5_clean_4c_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/devagonal/flan-t5-rouge-durga-q5-clean-4c + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_q5_clean_4d_en.md b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_q5_clean_4d_en.md new file mode 100644 index 00000000000000..1cc8d677d8355c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_q5_clean_4d_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English flan_t5_rouge_durga_q5_clean_4d T5Transformer from devagonal +author: John Snow Labs +name: flan_t5_rouge_durga_q5_clean_4d +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_rouge_durga_q5_clean_4d` is a English model originally trained by devagonal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_rouge_durga_q5_clean_4d_en_5.5.1_3.0_1734328631803.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_rouge_durga_q5_clean_4d_en_5.5.1_3.0_1734328631803.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +t5 = T5Transformer.pretrained("flan_t5_rouge_durga_q5_clean_4d","en") \ + .setInputCols(["document"]) \ + .setOutputCol("output") + +pipeline = Pipeline().setStages([documentAssembler, t5]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCols("text") + .setOutputCols("document") + +val t5 = T5Transformer.pretrained("flan_t5_rouge_durga_q5_clean_4d", "en") + .setInputCols(Array("documents")) + .setOutputCol("output") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, t5)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_rouge_durga_q5_clean_4d| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/devagonal/flan-t5-rouge-durga-q5-clean-4d \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_q5_clean_4d_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_q5_clean_4d_pipeline_en.md new file mode 100644 index 00000000000000..642fac6a157e09 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_durga_q5_clean_4d_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English flan_t5_rouge_durga_q5_clean_4d_pipeline pipeline T5Transformer from devagonal +author: John Snow Labs +name: flan_t5_rouge_durga_q5_clean_4d_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_rouge_durga_q5_clean_4d_pipeline` is a English model originally trained by devagonal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_rouge_durga_q5_clean_4d_pipeline_en_5.5.1_3.0_1734328685639.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_rouge_durga_q5_clean_4d_pipeline_en_5.5.1_3.0_1734328685639.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("flan_t5_rouge_durga_q5_clean_4d_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("flan_t5_rouge_durga_q5_clean_4d_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_rouge_durga_q5_clean_4d_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/devagonal/flan-t5-rouge-durga-q5-clean-4d + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_muhammad_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_muhammad_pipeline_en.md new file mode 100644 index 00000000000000..65156584f0ba19 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_rouge_muhammad_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English flan_t5_rouge_muhammad_pipeline pipeline T5Transformer from devagonal +author: John Snow Labs +name: flan_t5_rouge_muhammad_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_rouge_muhammad_pipeline` is a English model originally trained by devagonal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_rouge_muhammad_pipeline_en_5.5.1_3.0_1734332864959.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_rouge_muhammad_pipeline_en_5.5.1_3.0_1734332864959.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("flan_t5_rouge_muhammad_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("flan_t5_rouge_muhammad_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_rouge_muhammad_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/devagonal/flan-t5-rouge-muhammad + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-flan_t5_small_title_en.md b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_small_title_en.md new file mode 100644 index 00000000000000..8a557c905647fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_small_title_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English flan_t5_small_title T5Transformer from agentlans +author: John Snow Labs +name: flan_t5_small_title +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_small_title` is a English model originally trained by agentlans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_small_title_en_5.5.1_3.0_1734328518247.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_small_title_en_5.5.1_3.0_1734328518247.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +t5 = T5Transformer.pretrained("flan_t5_small_title","en") \ + .setInputCols(["document"]) \ + .setOutputCol("output") + +pipeline = Pipeline().setStages([documentAssembler, t5]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCols("text") + .setOutputCols("document") + +val t5 = T5Transformer.pretrained("flan_t5_small_title", "en") + .setInputCols(Array("documents")) + .setOutputCol("output") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, t5)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
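
The pipeline from the example above handles many documents in one pass, producing a generated title per row; the sample texts below are made up.

```python
texts = [
    ["Spark NLP provides production-grade NLP annotators on top of Apache Spark."],
    ["Transformer models can be exported to ONNX for faster CPU inference."],
]
data = spark.createDataFrame(texts).toDF("text")

pipeline.fit(data).transform(data).select("text", "output.result").show(truncate=False)
```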
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_small_title| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|349.8 MB| + +## References + +https://huggingface.co/agentlans/flan-t5-small-title \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-flan_t5_small_title_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_small_title_pipeline_en.md new file mode 100644 index 00000000000000..70f30613f5f4c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-flan_t5_small_title_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English flan_t5_small_title_pipeline pipeline T5Transformer from agentlans +author: John Snow Labs +name: flan_t5_small_title_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`flan_t5_small_title_pipeline` is a English model originally trained by agentlans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flan_t5_small_title_pipeline_en_5.5.1_3.0_1734328536311.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flan_t5_small_title_pipeline_en_5.5.1_3.0_1734328536311.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("flan_t5_small_title_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("flan_t5_small_title_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
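
`transform` expects a DataFrame with a `text` column. A minimal sketch, assuming an active SparkSession named `spark`; inspecting the schema shows which annotation columns the pipeline produces:

```python
# Build a one-row DataFrame with the expected "text" column, then run the pretrained pipeline
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)
annotations.printSchema()  # lists the annotation columns added by the pipeline stages
```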
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flan_t5_small_title_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|349.8 MB| + +## References + +https://huggingface.co/agentlans/flan-t5-small-title + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-ft_t5_small_ganda_en.md b/docs/_posts/ahmedlone127/2024-12-16-ft_t5_small_ganda_en.md new file mode 100644 index 00000000000000..29984bc5e22a3f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-ft_t5_small_ganda_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English ft_t5_small_ganda T5Transformer from MubarakB +author: John Snow Labs +name: ft_t5_small_ganda +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ft_t5_small_ganda` is a English model originally trained by MubarakB. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ft_t5_small_ganda_en_5.5.1_3.0_1734328812865.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ft_t5_small_ganda_en_5.5.1_3.0_1734328812865.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("ft_t5_small_ganda","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("ft_t5_small_ganda", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ft_t5_small_ganda| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|338.0 MB| + +## References + +https://huggingface.co/MubarakB/ft-t5-small-lg \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-ft_t5_small_ganda_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-ft_t5_small_ganda_pipeline_en.md new file mode 100644 index 00000000000000..bbd59c64c271a4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-ft_t5_small_ganda_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English ft_t5_small_ganda_pipeline pipeline T5Transformer from MubarakB +author: John Snow Labs +name: ft_t5_small_ganda_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ft_t5_small_ganda_pipeline` is a English model originally trained by MubarakB. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ft_t5_small_ganda_pipeline_en_5.5.1_3.0_1734328832852.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ft_t5_small_ganda_pipeline_en_5.5.1_3.0_1734328832852.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("ft_t5_small_ganda_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("ft_t5_small_ganda_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ft_t5_small_ganda_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|338.0 MB| + +## References + +https://huggingface.co/MubarakB/ft-t5-small-lg + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-hubert_large_ll60k_librispeech_multi_gpu_en.md b/docs/_posts/ahmedlone127/2024-12-16-hubert_large_ll60k_librispeech_multi_gpu_en.md new file mode 100644 index 00000000000000..53e434b55b7259 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-hubert_large_ll60k_librispeech_multi_gpu_en.md @@ -0,0 +1,84 @@ +--- +layout: model +title: English hubert_large_ll60k_librispeech_multi_gpu HubertForCTC from r-sharma-coder +author: John Snow Labs +name: hubert_large_ll60k_librispeech_multi_gpu +date: 2024-12-16 +tags: [en, open_source, onnx, asr, hubert] +task: Automatic Speech Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: HubertForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained HubertForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`hubert_large_ll60k_librispeech_multi_gpu` is a English model originally trained by r-sharma-coder. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hubert_large_ll60k_librispeech_multi_gpu_en_5.5.1_3.0_1734308987645.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hubert_large_ll60k_librispeech_multi_gpu_en_5.5.1_3.0_1734308987645.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

audioAssembler = AudioAssembler() \
    .setInputCol("audio_content") \
    .setOutputCol("audio_assembler")

speechToText = HubertForCTC.pretrained("hubert_large_ll60k_librispeech_multi_gpu","en") \
    .setInputCols(["audio_assembler"]) \
    .setOutputCol("text")

pipeline = Pipeline().setStages([audioAssembler, speechToText])
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val audioAssembler = new AudioAssembler()
    .setInputCol("audio_content")
    .setOutputCol("audio_assembler")

val speechToText = HubertForCTC.pretrained("hubert_large_ll60k_librispeech_multi_gpu", "en")
    .setInputCols(Array("audio_assembler"))
    .setOutputCol("text")

val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText))
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
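
The example above assumes a `data` DataFrame that already contains an `audio_content` column of raw audio samples. A minimal sketch of one way to build it, assuming `librosa` is installed and `audio.wav` is a hypothetical 16 kHz mono recording:

```python
import librosa

# Load the waveform as floats; HuBERT models typically expect 16 kHz mono audio
raw_floats, _ = librosa.load("audio.wav", sr=16000, mono=True)

# One row, one column named "audio_content", matching AudioAssembler.setInputCol above
data = spark.createDataFrame([[raw_floats.tolist()]]).toDF("audio_content")
```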
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hubert_large_ll60k_librispeech_multi_gpu| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|en| +|Size:|2.4 GB| + +## References + +https://huggingface.co/r-sharma-coder/hubert-large-ll60k-librispeech-multi-gpu \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-hubert_large_ll60k_librispeech_multi_gpu_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-hubert_large_ll60k_librispeech_multi_gpu_pipeline_en.md new file mode 100644 index 00000000000000..503bddf1212ee8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-hubert_large_ll60k_librispeech_multi_gpu_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English hubert_large_ll60k_librispeech_multi_gpu_pipeline pipeline HubertForCTC from r-sharma-coder +author: John Snow Labs +name: hubert_large_ll60k_librispeech_multi_gpu_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Automatic Speech Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained HubertForCTC, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`hubert_large_ll60k_librispeech_multi_gpu_pipeline` is a English model originally trained by r-sharma-coder. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hubert_large_ll60k_librispeech_multi_gpu_pipeline_en_5.5.1_3.0_1734309111102.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hubert_large_ll60k_librispeech_multi_gpu_pipeline_en_5.5.1_3.0_1734309111102.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("hubert_large_ll60k_librispeech_multi_gpu_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("hubert_large_ll60k_librispeech_multi_gpu_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hubert_large_ll60k_librispeech_multi_gpu_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|2.4 GB| + +## References + +https://huggingface.co/r-sharma-coder/hubert-large-ll60k-librispeech-multi-gpu + +## Included Models + +- AudioAssembler +- HubertForCTC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-hubert_large_ls960_ft_v2_en.md b/docs/_posts/ahmedlone127/2024-12-16-hubert_large_ls960_ft_v2_en.md new file mode 100644 index 00000000000000..1766ea8b7e2c17 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-hubert_large_ls960_ft_v2_en.md @@ -0,0 +1,84 @@ +--- +layout: model +title: English hubert_large_ls960_ft_v2 HubertForCTC from nrshoudi +author: John Snow Labs +name: hubert_large_ls960_ft_v2 +date: 2024-12-16 +tags: [en, open_source, onnx, asr, hubert] +task: Automatic Speech Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: HubertForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained HubertForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`hubert_large_ls960_ft_v2` is a English model originally trained by nrshoudi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hubert_large_ls960_ft_v2_en_5.5.1_3.0_1734308874008.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hubert_large_ls960_ft_v2_en_5.5.1_3.0_1734308874008.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

audioAssembler = AudioAssembler() \
    .setInputCol("audio_content") \
    .setOutputCol("audio_assembler")

speechToText = HubertForCTC.pretrained("hubert_large_ls960_ft_v2","en") \
    .setInputCols(["audio_assembler"]) \
    .setOutputCol("text")

pipeline = Pipeline().setStages([audioAssembler, speechToText])
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val audioAssembler = new AudioAssembler()
    .setInputCol("audio_content")
    .setOutputCol("audio_assembler")

val speechToText = HubertForCTC.pretrained("hubert_large_ls960_ft_v2", "en")
    .setInputCols(Array("audio_assembler"))
    .setOutputCol("text")

val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText))
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hubert_large_ls960_ft_v2| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|en| +|Size:|2.4 GB| + +## References + +https://huggingface.co/nrshoudi/hubert-large-ls960-ft-V2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-hubert_large_ls960_ft_v2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-hubert_large_ls960_ft_v2_pipeline_en.md new file mode 100644 index 00000000000000..2591f4aa5edf65 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-hubert_large_ls960_ft_v2_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English hubert_large_ls960_ft_v2_pipeline pipeline HubertForCTC from nrshoudi +author: John Snow Labs +name: hubert_large_ls960_ft_v2_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Automatic Speech Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained HubertForCTC, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`hubert_large_ls960_ft_v2_pipeline` is a English model originally trained by nrshoudi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hubert_large_ls960_ft_v2_pipeline_en_5.5.1_3.0_1734308998786.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hubert_large_ls960_ft_v2_pipeline_en_5.5.1_3.0_1734308998786.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("hubert_large_ls960_ft_v2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("hubert_large_ls960_ft_v2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hubert_large_ls960_ft_v2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|2.4 GB| + +## References + +https://huggingface.co/nrshoudi/hubert-large-ls960-ft-V2 + +## Included Models + +- AudioAssembler +- HubertForCTC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-hubert_qa_milqa_impossible_hu.md b/docs/_posts/ahmedlone127/2024-12-16-hubert_qa_milqa_impossible_hu.md new file mode 100644 index 00000000000000..0ecd2b493a241b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-hubert_qa_milqa_impossible_hu.md @@ -0,0 +1,86 @@ +--- +layout: model +title: Hungarian hubert_qa_milqa_impossible BertForQuestionAnswering from ZTamas +author: John Snow Labs +name: hubert_qa_milqa_impossible +date: 2024-12-16 +tags: [hu, open_source, onnx, question_answering, bert] +task: Question Answering +language: hu +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`hubert_qa_milqa_impossible` is a Hungarian model originally trained by ZTamas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hubert_qa_milqa_impossible_hu_5.5.1_3.0_1734338520581.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hubert_qa_milqa_impossible_hu_5.5.1_3.0_1734338520581.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = MultiDocumentAssembler() \
    .setInputCols(["question", "context"]) \
    .setOutputCols(["document_question", "document_context"])

spanClassifier = BertForQuestionAnswering.pretrained("hubert_qa_milqa_impossible","hu") \
    .setInputCols(["document_question","document_context"]) \
    .setOutputCol("answer")

pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new MultiDocumentAssembler()
    .setInputCols(Array("question", "context"))
    .setOutputCols(Array("document_question", "document_context"))

val spanClassifier = BertForQuestionAnswering.pretrained("hubert_qa_milqa_impossible", "hu")
    .setInputCols(Array("document_question","document_context"))
    .setOutputCol("answer")

val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
val data = Seq(("What framework do I use?","I use spark-nlp.")).toDF("question", "context")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
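
After `transform`, the predicted span is available in the `answer` column as a Spark NLP annotation; its `result` field holds the answer text. A minimal sketch:

```python
# Show the extracted answer text for each question/context pair
pipelineDF.select("answer.result").show(truncate=False)
```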
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hubert_qa_milqa_impossible| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|hu| +|Size:|412.4 MB| + +## References + +https://huggingface.co/ZTamas/hubert-qa-milqa-impossible \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-hubert_tiny_unit_en.md b/docs/_posts/ahmedlone127/2024-12-16-hubert_tiny_unit_en.md new file mode 100644 index 00000000000000..1e68e08a04311c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-hubert_tiny_unit_en.md @@ -0,0 +1,84 @@ +--- +layout: model +title: English hubert_tiny_unit HubertForCTC from voidful +author: John Snow Labs +name: hubert_tiny_unit +date: 2024-12-16 +tags: [en, open_source, onnx, asr, hubert] +task: Automatic Speech Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: HubertForCTC +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained HubertForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`hubert_tiny_unit` is a English model originally trained by voidful. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hubert_tiny_unit_en_5.5.1_3.0_1734308883914.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hubert_tiny_unit_en_5.5.1_3.0_1734308883914.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

audioAssembler = AudioAssembler() \
    .setInputCol("audio_content") \
    .setOutputCol("audio_assembler")

speechToText = HubertForCTC.pretrained("hubert_tiny_unit","en") \
    .setInputCols(["audio_assembler"]) \
    .setOutputCol("text")

pipeline = Pipeline().setStages([audioAssembler, speechToText])
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val audioAssembler = new AudioAssembler()
    .setInputCol("audio_content")
    .setOutputCol("audio_assembler")

val speechToText = HubertForCTC.pretrained("hubert_tiny_unit", "en")
    .setInputCols(Array("audio_assembler"))
    .setOutputCol("text")

val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText))
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hubert_tiny_unit| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[audio_assembler]| +|Output Labels:|[text]| +|Language:|en| +|Size:|126.6 MB| + +## References + +https://huggingface.co/voidful/hubert-tiny-unit \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-hubert_tiny_unit_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-hubert_tiny_unit_pipeline_en.md new file mode 100644 index 00000000000000..6010b569e4b25a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-hubert_tiny_unit_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English hubert_tiny_unit_pipeline pipeline HubertForCTC from voidful +author: John Snow Labs +name: hubert_tiny_unit_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Automatic Speech Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained HubertForCTC, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`hubert_tiny_unit_pipeline` is a English model originally trained by voidful. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hubert_tiny_unit_pipeline_en_5.5.1_3.0_1734308909217.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hubert_tiny_unit_pipeline_en_5.5.1_3.0_1734308909217.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("hubert_tiny_unit_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("hubert_tiny_unit_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hubert_tiny_unit_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|126.6 MB| + +## References + +https://huggingface.co/voidful/hubert-tiny-unit + +## Included Models + +- AudioAssembler +- HubertForCTC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-icelandic_this_furry_en.md b/docs/_posts/ahmedlone127/2024-12-16-icelandic_this_furry_en.md new file mode 100644 index 00000000000000..546d554958113d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-icelandic_this_furry_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English icelandic_this_furry SwinForImageClassification from SaintGermain +author: John Snow Labs +name: icelandic_this_furry +date: 2024-12-16 +tags: [en, open_source, onnx, image_classification, swin] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: SwinForImageClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`icelandic_this_furry` is a English model originally trained by SaintGermain. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/icelandic_this_furry_en_5.5.1_3.0_1734325605528.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/icelandic_this_furry_en_5.5.1_3.0_1734325605528.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

image_assembler = ImageAssembler() \
    .setInputCol("image") \
    .setOutputCol("image_assembler")

imageClassifier = SwinForImageClassification.pretrained("icelandic_this_furry","en") \
    .setInputCols("image_assembler") \
    .setOutputCol("class")

pipeline = Pipeline(stages=[
    image_assembler,
    imageClassifier,
])

pipelineModel = pipeline.fit(imageDF)

pipelineDF = pipelineModel.transform(imageDF)

```
```scala

val imageAssembler = new ImageAssembler()
    .setInputCol("image")
    .setOutputCol("image_assembler")

val imageClassifier = SwinForImageClassification.pretrained("icelandic_this_furry","en")
    .setInputCols("image_assembler")
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(imageAssembler, imageClassifier))

val pipelineModel = pipeline.fit(imageDF)

val pipelineDF = pipelineModel.transform(imageDF)

```
</div>
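
The example above assumes an `imageDF` with an `image` column. A minimal sketch of one way to create it with Spark's built-in image data source, using a hypothetical local folder of images:

```python
# Read images into the "image" struct column expected by ImageAssembler;
# dropInvalid skips files Spark cannot decode
imageDF = spark.read.format("image") \
    .option("dropInvalid", True) \
    .load("path/to/images/")
```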
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|icelandic_this_furry| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[image_assembler]| +|Output Labels:|[label]| +|Language:|en| +|Size:|649.8 MB| + +## References + +https://huggingface.co/SaintGermain/is-this-furry \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-icelandic_this_furry_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-icelandic_this_furry_pipeline_en.md new file mode 100644 index 00000000000000..c1cbfed09714eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-icelandic_this_furry_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English icelandic_this_furry_pipeline pipeline SwinForImageClassification from SaintGermain +author: John Snow Labs +name: icelandic_this_furry_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`icelandic_this_furry_pipeline` is a English model originally trained by SaintGermain. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/icelandic_this_furry_pipeline_en_5.5.1_3.0_1734325644167.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/icelandic_this_furry_pipeline_en_5.5.1_3.0_1734325644167.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("icelandic_this_furry_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("icelandic_this_furry_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|icelandic_this_furry_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|649.8 MB| + +## References + +https://huggingface.co/SaintGermain/is-this-furry + +## Included Models + +- ImageAssembler +- SwinForImageClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-idt5_base_qa_qg_baseline_tydiqa_indonesian_hl_en.md b/docs/_posts/ahmedlone127/2024-12-16-idt5_base_qa_qg_baseline_tydiqa_indonesian_hl_en.md new file mode 100644 index 00000000000000..5b9dcf739c9739 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-idt5_base_qa_qg_baseline_tydiqa_indonesian_hl_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English idt5_base_qa_qg_baseline_tydiqa_indonesian_hl T5Transformer from hawalurahman +author: John Snow Labs +name: idt5_base_qa_qg_baseline_tydiqa_indonesian_hl +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`idt5_base_qa_qg_baseline_tydiqa_indonesian_hl` is a English model originally trained by hawalurahman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/idt5_base_qa_qg_baseline_tydiqa_indonesian_hl_en_5.5.1_3.0_1734328620148.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/idt5_base_qa_qg_baseline_tydiqa_indonesian_hl_en_5.5.1_3.0_1734328620148.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("idt5_base_qa_qg_baseline_tydiqa_indonesian_hl","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("idt5_base_qa_qg_baseline_tydiqa_indonesian_hl", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
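
Generation behaviour can be tuned on the `T5Transformer` stage before the pipeline is built; limiting the output length, for example, keeps generated answers short. A minimal sketch with illustrative values:

```python
# Optional generation settings on the T5 stage (values are illustrative)
t5.setMaxOutputLength(64)   # cap the number of generated tokens
t5.setDoSample(False)       # greedy decoding instead of sampling
```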
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|idt5_base_qa_qg_baseline_tydiqa_indonesian_hl| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|974.8 MB| + +## References + +https://huggingface.co/hawalurahman/idt5-base-qa-qg-baseline-TydiQA-id_hl \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-idt5_base_qa_qg_baseline_tydiqa_indonesian_hl_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-idt5_base_qa_qg_baseline_tydiqa_indonesian_hl_pipeline_en.md new file mode 100644 index 00000000000000..349a7c70e74630 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-idt5_base_qa_qg_baseline_tydiqa_indonesian_hl_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English idt5_base_qa_qg_baseline_tydiqa_indonesian_hl_pipeline pipeline T5Transformer from hawalurahman +author: John Snow Labs +name: idt5_base_qa_qg_baseline_tydiqa_indonesian_hl_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`idt5_base_qa_qg_baseline_tydiqa_indonesian_hl_pipeline` is a English model originally trained by hawalurahman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/idt5_base_qa_qg_baseline_tydiqa_indonesian_hl_pipeline_en_5.5.1_3.0_1734328670544.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/idt5_base_qa_qg_baseline_tydiqa_indonesian_hl_pipeline_en_5.5.1_3.0_1734328670544.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("idt5_base_qa_qg_baseline_tydiqa_indonesian_hl_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("idt5_base_qa_qg_baseline_tydiqa_indonesian_hl_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|idt5_base_qa_qg_baseline_tydiqa_indonesian_hl_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|974.8 MB| + +## References + +https://huggingface.co/hawalurahman/idt5-base-qa-qg-baseline-TydiQA-id_hl + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-indoqa_id.md b/docs/_posts/ahmedlone127/2024-12-16-indoqa_id.md new file mode 100644 index 00000000000000..d6d38139ed952a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-indoqa_id.md @@ -0,0 +1,86 @@ +--- +layout: model +title: Indonesian indoqa BertForQuestionAnswering from digo-prayudha +author: John Snow Labs +name: indoqa +date: 2024-12-16 +tags: [id, open_source, onnx, question_answering, bert] +task: Question Answering +language: id +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`indoqa` is a Indonesian model originally trained by digo-prayudha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indoqa_id_5.5.1_3.0_1734339212446.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indoqa_id_5.5.1_3.0_1734339212446.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = MultiDocumentAssembler() \
    .setInputCols(["question", "context"]) \
    .setOutputCols(["document_question", "document_context"])

spanClassifier = BertForQuestionAnswering.pretrained("indoqa","id") \
    .setInputCols(["document_question","document_context"]) \
    .setOutputCol("answer")

pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new MultiDocumentAssembler()
    .setInputCols(Array("question", "context"))
    .setOutputCols(Array("document_question", "document_context"))

val spanClassifier = BertForQuestionAnswering.pretrained("indoqa", "id")
    .setInputCols(Array("document_question","document_context"))
    .setOutputCol("answer")

val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
val data = Seq(("What framework do I use?","I use spark-nlp.")).toDF("question", "context")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indoqa| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|id| +|Size:|411.7 MB| + +## References + +https://huggingface.co/digo-prayudha/IndoQA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-indoqa_pipeline_id.md b/docs/_posts/ahmedlone127/2024-12-16-indoqa_pipeline_id.md new file mode 100644 index 00000000000000..90b6f1346e8a1e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-indoqa_pipeline_id.md @@ -0,0 +1,69 @@ +--- +layout: model +title: Indonesian indoqa_pipeline pipeline BertForQuestionAnswering from digo-prayudha +author: John Snow Labs +name: indoqa_pipeline +date: 2024-12-16 +tags: [id, open_source, pipeline, onnx] +task: Question Answering +language: id +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`indoqa_pipeline` is a Indonesian model originally trained by digo-prayudha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indoqa_pipeline_id_5.5.1_3.0_1734339236175.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indoqa_pipeline_id_5.5.1_3.0_1734339236175.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("indoqa_pipeline", lang = "id") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("indoqa_pipeline", lang = "id") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indoqa_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|id| +|Size:|411.7 MB| + +## References + +https://huggingface.co/digo-prayudha/IndoQA + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-judge_answer___33_deberta_large_enwiki_answerability_2411_en.md b/docs/_posts/ahmedlone127/2024-12-16-judge_answer___33_deberta_large_enwiki_answerability_2411_en.md new file mode 100644 index 00000000000000..14084981888c57 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-judge_answer___33_deberta_large_enwiki_answerability_2411_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English judge_answer___33_deberta_large_enwiki_answerability_2411 DeBertaForSequenceClassification from tom-010 +author: John Snow Labs +name: judge_answer___33_deberta_large_enwiki_answerability_2411 +date: 2024-12-16 +tags: [en, open_source, onnx, sequence_classification, deberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DeBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`judge_answer___33_deberta_large_enwiki_answerability_2411` is a English model originally trained by tom-010. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/judge_answer___33_deberta_large_enwiki_answerability_2411_en_5.5.1_3.0_1734312039742.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/judge_answer___33_deberta_large_enwiki_answerability_2411_en_5.5.1_3.0_1734312039742.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = DeBertaForSequenceClassification.pretrained("judge_answer___33_deberta_large_enwiki_answerability_2411","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = DeBertaForSequenceClassification.pretrained("judge_answer___33_deberta_large_enwiki_answerability_2411", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|judge_answer___33_deberta_large_enwiki_answerability_2411| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.6 GB| + +## References + +https://huggingface.co/tom-010/judge_answer___33_deberta_large_enwiki-answerability-2411 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-judge_answer___33_deberta_large_enwiki_answerability_2411_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-judge_answer___33_deberta_large_enwiki_answerability_2411_pipeline_en.md new file mode 100644 index 00000000000000..7fad26682f3102 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-judge_answer___33_deberta_large_enwiki_answerability_2411_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English judge_answer___33_deberta_large_enwiki_answerability_2411_pipeline pipeline DeBertaForSequenceClassification from tom-010 +author: John Snow Labs +name: judge_answer___33_deberta_large_enwiki_answerability_2411_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`judge_answer___33_deberta_large_enwiki_answerability_2411_pipeline` is a English model originally trained by tom-010. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/judge_answer___33_deberta_large_enwiki_answerability_2411_pipeline_en_5.5.1_3.0_1734312126613.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/judge_answer___33_deberta_large_enwiki_answerability_2411_pipeline_en_5.5.1_3.0_1734312126613.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("judge_answer___33_deberta_large_enwiki_answerability_2411_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("judge_answer___33_deberta_large_enwiki_answerability_2411_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|judge_answer___33_deberta_large_enwiki_answerability_2411_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.6 GB| + +## References + +https://huggingface.co/tom-010/judge_answer___33_deberta_large_enwiki-answerability-2411 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DeBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-kazparc_english_russian_model_1_en.md b/docs/_posts/ahmedlone127/2024-12-16-kazparc_english_russian_model_1_en.md new file mode 100644 index 00000000000000..f560ec2eb74a26 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-kazparc_english_russian_model_1_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English kazparc_english_russian_model_1 T5Transformer from Goshective +author: John Snow Labs +name: kazparc_english_russian_model_1 +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`kazparc_english_russian_model_1` is a English model originally trained by Goshective. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kazparc_english_russian_model_1_en_5.5.1_3.0_1734331727649.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kazparc_english_russian_model_1_en_5.5.1_3.0_1734331727649.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("kazparc_english_russian_model_1","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("kazparc_english_russian_model_1", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kazparc_english_russian_model_1| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|343.9 MB| + +## References + +https://huggingface.co/Goshective/kazparc_en_ru_model_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-kazparc_english_russian_model_1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-kazparc_english_russian_model_1_pipeline_en.md new file mode 100644 index 00000000000000..86b067118652eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-kazparc_english_russian_model_1_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English kazparc_english_russian_model_1_pipeline pipeline T5Transformer from Goshective +author: John Snow Labs +name: kazparc_english_russian_model_1_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`kazparc_english_russian_model_1_pipeline` is a English model originally trained by Goshective. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kazparc_english_russian_model_1_pipeline_en_5.5.1_3.0_1734331746326.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kazparc_english_russian_model_1_pipeline_en_5.5.1_3.0_1734331746326.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("kazparc_english_russian_model_1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("kazparc_english_russian_model_1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kazparc_english_russian_model_1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|343.9 MB| + +## References + +https://huggingface.co/Goshective/kazparc_en_ru_model_1 + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-kazparc_russian_english_model_1_goshective_en.md b/docs/_posts/ahmedlone127/2024-12-16-kazparc_russian_english_model_1_goshective_en.md new file mode 100644 index 00000000000000..c55aac39c665ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-kazparc_russian_english_model_1_goshective_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English kazparc_russian_english_model_1_goshective T5Transformer from Goshective +author: John Snow Labs +name: kazparc_russian_english_model_1_goshective +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`kazparc_russian_english_model_1_goshective` is a English model originally trained by Goshective. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kazparc_russian_english_model_1_goshective_en_5.5.1_3.0_1734328766385.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kazparc_russian_english_model_1_goshective_en_5.5.1_3.0_1734328766385.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("kazparc_russian_english_model_1_goshective","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("kazparc_russian_english_model_1_goshective", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kazparc_russian_english_model_1_goshective| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|343.0 MB| + +## References + +https://huggingface.co/Goshective/kazparc_ru_en_model_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-kazparc_russian_english_model_1_goshective_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-kazparc_russian_english_model_1_goshective_pipeline_en.md new file mode 100644 index 00000000000000..c41c56ad98af82 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-kazparc_russian_english_model_1_goshective_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English kazparc_russian_english_model_1_goshective_pipeline pipeline T5Transformer from Goshective +author: John Snow Labs +name: kazparc_russian_english_model_1_goshective_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`kazparc_russian_english_model_1_goshective_pipeline` is a English model originally trained by Goshective. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kazparc_russian_english_model_1_goshective_pipeline_en_5.5.1_3.0_1734328785642.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kazparc_russian_english_model_1_goshective_pipeline_en_5.5.1_3.0_1734328785642.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("kazparc_russian_english_model_1_goshective_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("kazparc_russian_english_model_1_goshective_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kazparc_russian_english_model_1_goshective_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|343.0 MB| + +## References + +https://huggingface.co/Goshective/kazparc_ru_en_model_1 + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-keyword_summarizer_10000_v2_en.md b/docs/_posts/ahmedlone127/2024-12-16-keyword_summarizer_10000_v2_en.md new file mode 100644 index 00000000000000..be0df7341e923b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-keyword_summarizer_10000_v2_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English keyword_summarizer_10000_v2 T5Transformer from ZephyrUtopia +author: John Snow Labs +name: keyword_summarizer_10000_v2 +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`keyword_summarizer_10000_v2` is a English model originally trained by ZephyrUtopia. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/keyword_summarizer_10000_v2_en_5.5.1_3.0_1734328982465.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/keyword_summarizer_10000_v2_en_5.5.1_3.0_1734328982465.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("keyword_summarizer_10000_v2","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("keyword_summarizer_10000_v2", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
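The generated text ends up in the `output` annotation column configured above. A minimal sketch for reading it back, assuming the `pipelineDF` produced in the Python example:

```python
# Minimal sketch, assuming pipelineDF from the example above.
# "output.result" holds the strings generated by the model.
pipelineDF.select("text", "output.result").show(truncate=False)
```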
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|keyword_summarizer_10000_v2| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/ZephyrUtopia/keyword-summarizer-10000-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-keyword_summarizer_10000_v2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-keyword_summarizer_10000_v2_pipeline_en.md new file mode 100644 index 00000000000000..162e1f5d94a774 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-keyword_summarizer_10000_v2_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English keyword_summarizer_10000_v2_pipeline pipeline T5Transformer from ZephyrUtopia +author: John Snow Labs +name: keyword_summarizer_10000_v2_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`keyword_summarizer_10000_v2_pipeline` is a English model originally trained by ZephyrUtopia. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/keyword_summarizer_10000_v2_pipeline_en_5.5.1_3.0_1734329032640.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/keyword_summarizer_10000_v2_pipeline_en_5.5.1_3.0_1734329032640.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

pipeline = PretrainedPipeline("keyword_summarizer_10000_v2_pipeline", lang = "en")
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

val pipeline = new PretrainedPipeline("keyword_summarizer_10000_v2_pipeline", lang = "en")
val df = Seq("I love spark-nlp").toDS.toDF("text")
val annotations = pipeline.transform(df)

```
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|keyword_summarizer_10000_v2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/ZephyrUtopia/keyword-summarizer-10000-v2 + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-lora_medical_flan_t5_small_en.md b/docs/_posts/ahmedlone127/2024-12-16-lora_medical_flan_t5_small_en.md new file mode 100644 index 00000000000000..a3ecee0a6c5aff --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-lora_medical_flan_t5_small_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English lora_medical_flan_t5_small T5Transformer from Yudsky +author: John Snow Labs +name: lora_medical_flan_t5_small +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`lora_medical_flan_t5_small` is a English model originally trained by Yudsky. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/lora_medical_flan_t5_small_en_5.5.1_3.0_1734332647878.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/lora_medical_flan_t5_small_en_5.5.1_3.0_1734332647878.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("lora_medical_flan_t5_small","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("lora_medical_flan_t5_small", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
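The generated text ends up in the `output` annotation column configured above. A minimal sketch for reading it back, assuming the `pipelineDF` produced in the Python example:

```python
# Minimal sketch, assuming pipelineDF from the example above.
# "output.result" holds the strings generated by the model.
pipelineDF.select("text", "output.result").show(truncate=False)
```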
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|lora_medical_flan_t5_small| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|349.8 MB| + +## References + +https://huggingface.co/Yudsky/lora-Medical-flan-T5-small \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-lora_medical_flan_t5_small_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-lora_medical_flan_t5_small_pipeline_en.md new file mode 100644 index 00000000000000..fce00083aabda1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-lora_medical_flan_t5_small_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English lora_medical_flan_t5_small_pipeline pipeline T5Transformer from Yudsky +author: John Snow Labs +name: lora_medical_flan_t5_small_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`lora_medical_flan_t5_small_pipeline` is a English model originally trained by Yudsky. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/lora_medical_flan_t5_small_pipeline_en_5.5.1_3.0_1734332668025.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/lora_medical_flan_t5_small_pipeline_en_5.5.1_3.0_1734332668025.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

pipeline = PretrainedPipeline("lora_medical_flan_t5_small_pipeline", lang = "en")
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

val pipeline = new PretrainedPipeline("lora_medical_flan_t5_small_pipeline", lang = "en")
val df = Seq("I love spark-nlp").toDS.toDF("text")
val annotations = pipeline.transform(df)

```
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|lora_medical_flan_t5_small_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|349.8 MB| + +## References + +https://huggingface.co/Yudsky/lora-Medical-flan-T5-small + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_qnli_10_en.md b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_qnli_10_en.md new file mode 100644 index 00000000000000..c6af3a793bce70 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_qnli_10_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English mdeberta_v3_base_qnli_10 DeBertaForSequenceClassification from tmnam20 +author: John Snow Labs +name: mdeberta_v3_base_qnli_10 +date: 2024-12-16 +tags: [en, open_source, onnx, sequence_classification, deberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DeBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mdeberta_v3_base_qnli_10` is a English model originally trained by tmnam20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_qnli_10_en_5.5.1_3.0_1734313814273.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_qnli_10_en_5.5.1_3.0_1734313814273.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = DeBertaForSequenceClassification.pretrained("mdeberta_v3_base_qnli_10","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = DeBertaForSequenceClassification.pretrained("mdeberta_v3_base_qnli_10", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
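The predicted labels end up in the `class` annotation column configured above. A minimal sketch for reading them back, assuming the `pipelineDF` produced in the Python example:

```python
# Minimal sketch, assuming pipelineDF from the example above.
# "class.result" holds the label predicted for each input row.
pipelineDF.select("text", "class.result").show(truncate=False)
```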
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mdeberta_v3_base_qnli_10| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|828.6 MB| + +## References + +https://huggingface.co/tmnam20/mdeberta-v3-base-qnli-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_qnli_10_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_qnli_10_pipeline_en.md new file mode 100644 index 00000000000000..3fe2b8118c8121 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_qnli_10_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English mdeberta_v3_base_qnli_10_pipeline pipeline DeBertaForSequenceClassification from tmnam20 +author: John Snow Labs +name: mdeberta_v3_base_qnli_10_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mdeberta_v3_base_qnli_10_pipeline` is a English model originally trained by tmnam20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_qnli_10_pipeline_en_5.5.1_3.0_1734313939566.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_qnli_10_pipeline_en_5.5.1_3.0_1734313939566.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

pipeline = PretrainedPipeline("mdeberta_v3_base_qnli_10_pipeline", lang = "en")
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

val pipeline = new PretrainedPipeline("mdeberta_v3_base_qnli_10_pipeline", lang = "en")
val df = Seq("I love spark-nlp").toDS.toDF("text")
val annotations = pipeline.transform(df)

```
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mdeberta_v3_base_qnli_10_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|828.6 MB| + +## References + +https://huggingface.co/tmnam20/mdeberta-v3-base-qnli-10 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DeBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_qqp_10_en.md b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_qqp_10_en.md new file mode 100644 index 00000000000000..f05b678228de8e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_qqp_10_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English mdeberta_v3_base_qqp_10 DeBertaForSequenceClassification from tmnam20 +author: John Snow Labs +name: mdeberta_v3_base_qqp_10 +date: 2024-12-16 +tags: [en, open_source, onnx, sequence_classification, deberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DeBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mdeberta_v3_base_qqp_10` is a English model originally trained by tmnam20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_qqp_10_en_5.5.1_3.0_1734311800041.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_qqp_10_en_5.5.1_3.0_1734311800041.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = DeBertaForSequenceClassification.pretrained("mdeberta_v3_base_qqp_10","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = DeBertaForSequenceClassification.pretrained("mdeberta_v3_base_qqp_10", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
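The predicted labels end up in the `class` annotation column configured above. A minimal sketch for reading them back, assuming the `pipelineDF` produced in the Python example:

```python
# Minimal sketch, assuming pipelineDF from the example above.
# "class.result" holds the label predicted for each input row.
pipelineDF.select("text", "class.result").show(truncate=False)
```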
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mdeberta_v3_base_qqp_10| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|834.7 MB| + +## References + +https://huggingface.co/tmnam20/mdeberta-v3-base-qqp-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_qqp_10_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_qqp_10_pipeline_en.md new file mode 100644 index 00000000000000..caa95ce59ec7e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_qqp_10_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English mdeberta_v3_base_qqp_10_pipeline pipeline DeBertaForSequenceClassification from tmnam20 +author: John Snow Labs +name: mdeberta_v3_base_qqp_10_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mdeberta_v3_base_qqp_10_pipeline` is a English model originally trained by tmnam20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_qqp_10_pipeline_en_5.5.1_3.0_1734311924605.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_qqp_10_pipeline_en_5.5.1_3.0_1734311924605.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

pipeline = PretrainedPipeline("mdeberta_v3_base_qqp_10_pipeline", lang = "en")
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

val pipeline = new PretrainedPipeline("mdeberta_v3_base_qqp_10_pipeline", lang = "en")
val df = Seq("I love spark-nlp").toDS.toDF("text")
val annotations = pipeline.transform(df)

```
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mdeberta_v3_base_qqp_10_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|834.8 MB| + +## References + +https://huggingface.co/tmnam20/mdeberta-v3-base-qqp-10 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DeBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_quality_en.md b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_quality_en.md new file mode 100644 index 00000000000000..1850ef271297d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_quality_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English mdeberta_v3_base_quality DeBertaForSequenceClassification from agentlans +author: John Snow Labs +name: mdeberta_v3_base_quality +date: 2024-12-16 +tags: [en, open_source, onnx, sequence_classification, deberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DeBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mdeberta_v3_base_quality` is a English model originally trained by agentlans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_quality_en_5.5.1_3.0_1734312256410.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_quality_en_5.5.1_3.0_1734312256410.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = DeBertaForSequenceClassification.pretrained("mdeberta_v3_base_quality","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = DeBertaForSequenceClassification.pretrained("mdeberta_v3_base_quality", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
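The predicted labels end up in the `class` annotation column configured above. A minimal sketch for reading them back, assuming the `pipelineDF` produced in the Python example:

```python
# Minimal sketch, assuming pipelineDF from the example above.
# "class.result" holds the label predicted for each input row.
pipelineDF.select("text", "class.result").show(truncate=False)
```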
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mdeberta_v3_base_quality| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|867.1 MB| + +## References + +https://huggingface.co/agentlans/mdeberta-v3-base-quality \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_quality_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_quality_pipeline_en.md new file mode 100644 index 00000000000000..29d11df505442c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_quality_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English mdeberta_v3_base_quality_pipeline pipeline DeBertaForSequenceClassification from agentlans +author: John Snow Labs +name: mdeberta_v3_base_quality_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mdeberta_v3_base_quality_pipeline` is a English model originally trained by agentlans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_quality_pipeline_en_5.5.1_3.0_1734312353608.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_quality_pipeline_en_5.5.1_3.0_1734312353608.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

pipeline = PretrainedPipeline("mdeberta_v3_base_quality_pipeline", lang = "en")
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

val pipeline = new PretrainedPipeline("mdeberta_v3_base_quality_pipeline", lang = "en")
val df = Seq("I love spark-nlp").toDS.toDF("text")
val annotations = pipeline.transform(df)

```
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mdeberta_v3_base_quality_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|867.1 MB| + +## References + +https://huggingface.co/agentlans/mdeberta-v3-base-quality + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DeBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_sst2_1_en.md b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_sst2_1_en.md new file mode 100644 index 00000000000000..846effa8d4d696 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_sst2_1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English mdeberta_v3_base_sst2_1 DeBertaForSequenceClassification from tmnam20 +author: John Snow Labs +name: mdeberta_v3_base_sst2_1 +date: 2024-12-16 +tags: [en, open_source, onnx, sequence_classification, deberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DeBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mdeberta_v3_base_sst2_1` is a English model originally trained by tmnam20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_sst2_1_en_5.5.1_3.0_1734312703809.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_sst2_1_en_5.5.1_3.0_1734312703809.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = DeBertaForSequenceClassification.pretrained("mdeberta_v3_base_sst2_1","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = DeBertaForSequenceClassification.pretrained("mdeberta_v3_base_sst2_1", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
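The predicted labels end up in the `class` annotation column configured above. A minimal sketch for reading them back, assuming the `pipelineDF` produced in the Python example:

```python
# Minimal sketch, assuming pipelineDF from the example above.
# "class.result" holds the label predicted for each input row.
pipelineDF.select("text", "class.result").show(truncate=False)
```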
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mdeberta_v3_base_sst2_1| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|785.6 MB| + +## References + +https://huggingface.co/tmnam20/mdeberta-v3-base-sst2-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_sst2_1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_sst2_1_pipeline_en.md new file mode 100644 index 00000000000000..44c42564bfd250 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_sst2_1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English mdeberta_v3_base_sst2_1_pipeline pipeline DeBertaForSequenceClassification from tmnam20 +author: John Snow Labs +name: mdeberta_v3_base_sst2_1_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mdeberta_v3_base_sst2_1_pipeline` is a English model originally trained by tmnam20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_sst2_1_pipeline_en_5.5.1_3.0_1734312848603.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_sst2_1_pipeline_en_5.5.1_3.0_1734312848603.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

pipeline = PretrainedPipeline("mdeberta_v3_base_sst2_1_pipeline", lang = "en")
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

val pipeline = new PretrainedPipeline("mdeberta_v3_base_sst2_1_pipeline", lang = "en")
val df = Seq("I love spark-nlp").toDS.toDF("text")
val annotations = pipeline.transform(df)

```
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mdeberta_v3_base_sst2_1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|785.6 MB| + +## References + +https://huggingface.co/tmnam20/mdeberta-v3-base-sst2-1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DeBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_vnrte_10_en.md b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_vnrte_10_en.md new file mode 100644 index 00000000000000..f34132c829e61f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_vnrte_10_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English mdeberta_v3_base_vnrte_10 DeBertaForSequenceClassification from tmnam20 +author: John Snow Labs +name: mdeberta_v3_base_vnrte_10 +date: 2024-12-16 +tags: [en, open_source, onnx, sequence_classification, deberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DeBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mdeberta_v3_base_vnrte_10` is a English model originally trained by tmnam20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_vnrte_10_en_5.5.1_3.0_1734313190542.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_vnrte_10_en_5.5.1_3.0_1734313190542.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = DeBertaForSequenceClassification.pretrained("mdeberta_v3_base_vnrte_10","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = DeBertaForSequenceClassification.pretrained("mdeberta_v3_base_vnrte_10", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
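The predicted labels end up in the `class` annotation column configured above. A minimal sketch for reading them back, assuming the `pipelineDF` produced in the Python example:

```python
# Minimal sketch, assuming pipelineDF from the example above.
# "class.result" holds the label predicted for each input row.
pipelineDF.select("text", "class.result").show(truncate=False)
```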
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mdeberta_v3_base_vnrte_10| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|794.9 MB| + +## References + +https://huggingface.co/tmnam20/mdeberta-v3-base-vnrte-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_vnrte_10_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_vnrte_10_pipeline_en.md new file mode 100644 index 00000000000000..4309d41021b4d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_vnrte_10_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English mdeberta_v3_base_vnrte_10_pipeline pipeline DeBertaForSequenceClassification from tmnam20 +author: John Snow Labs +name: mdeberta_v3_base_vnrte_10_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mdeberta_v3_base_vnrte_10_pipeline` is a English model originally trained by tmnam20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_vnrte_10_pipeline_en_5.5.1_3.0_1734313330468.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_vnrte_10_pipeline_en_5.5.1_3.0_1734313330468.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

pipeline = PretrainedPipeline("mdeberta_v3_base_vnrte_10_pipeline", lang = "en")
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

val pipeline = new PretrainedPipeline("mdeberta_v3_base_vnrte_10_pipeline", lang = "en")
val df = Seq("I love spark-nlp").toDS.toDF("text")
val annotations = pipeline.transform(df)

```
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mdeberta_v3_base_vnrte_10_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|794.9 MB| + +## References + +https://huggingface.co/tmnam20/mdeberta-v3-base-vnrte-10 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DeBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_vtoc_10_en.md b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_vtoc_10_en.md new file mode 100644 index 00000000000000..b61cbc4d4750fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_vtoc_10_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English mdeberta_v3_base_vtoc_10 DeBertaForSequenceClassification from tmnam20 +author: John Snow Labs +name: mdeberta_v3_base_vtoc_10 +date: 2024-12-16 +tags: [en, open_source, onnx, sequence_classification, deberta] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DeBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mdeberta_v3_base_vtoc_10` is a English model originally trained by tmnam20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_vtoc_10_en_5.5.1_3.0_1734312635411.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_vtoc_10_en_5.5.1_3.0_1734312635411.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = DeBertaForSequenceClassification.pretrained("mdeberta_v3_base_vtoc_10","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = DeBertaForSequenceClassification.pretrained("mdeberta_v3_base_vtoc_10", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
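The predicted labels end up in the `class` annotation column configured above. A minimal sketch for reading them back, assuming the `pipelineDF` produced in the Python example:

```python
# Minimal sketch, assuming pipelineDF from the example above.
# "class.result" holds the label predicted for each input row.
pipelineDF.select("text", "class.result").show(truncate=False)
```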
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mdeberta_v3_base_vtoc_10| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|789.5 MB| + +## References + +https://huggingface.co/tmnam20/mdeberta-v3-base-vtoc-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_vtoc_10_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_vtoc_10_pipeline_en.md new file mode 100644 index 00000000000000..f457cccb70c916 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mdeberta_v3_base_vtoc_10_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English mdeberta_v3_base_vtoc_10_pipeline pipeline DeBertaForSequenceClassification from tmnam20 +author: John Snow Labs +name: mdeberta_v3_base_vtoc_10_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DeBertaForSequenceClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mdeberta_v3_base_vtoc_10_pipeline` is a English model originally trained by tmnam20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_vtoc_10_pipeline_en_5.5.1_3.0_1734312780878.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mdeberta_v3_base_vtoc_10_pipeline_en_5.5.1_3.0_1734312780878.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

pipeline = PretrainedPipeline("mdeberta_v3_base_vtoc_10_pipeline", lang = "en")
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

val pipeline = new PretrainedPipeline("mdeberta_v3_base_vtoc_10_pipeline", lang = "en")
val df = Seq("I love spark-nlp").toDS.toDF("text")
val annotations = pipeline.transform(df)

```
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mdeberta_v3_base_vtoc_10_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|789.5 MB| + +## References + +https://huggingface.co/tmnam20/mdeberta-v3-base-vtoc-10 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DeBertaForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-meetingbank_qa_summary_en.md b/docs/_posts/ahmedlone127/2024-12-16-meetingbank_qa_summary_en.md new file mode 100644 index 00000000000000..6e97ba974840cb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-meetingbank_qa_summary_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English meetingbank_qa_summary T5Transformer from zu4425 +author: John Snow Labs +name: meetingbank_qa_summary +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`meetingbank_qa_summary` is a English model originally trained by zu4425. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/meetingbank_qa_summary_en_5.5.1_3.0_1734332803490.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/meetingbank_qa_summary_en_5.5.1_3.0_1734332803490.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("meetingbank_qa_summary","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("meetingbank_qa_summary", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
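The generated summaries end up in the `output` annotation column configured above. A minimal sketch for reading them back, assuming the `pipelineDF` produced in the Python example:

```python
# Minimal sketch, assuming pipelineDF from the example above.
# "output.result" holds the strings generated by the model.
pipelineDF.select("text", "output.result").show(truncate=False)
```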
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|meetingbank_qa_summary| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|284.4 MB| + +## References + +https://huggingface.co/zu4425/MeetingBank-QA-Summary \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-meetingbank_qa_summary_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-meetingbank_qa_summary_pipeline_en.md new file mode 100644 index 00000000000000..9eaaad55e1ac5e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-meetingbank_qa_summary_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English meetingbank_qa_summary_pipeline pipeline T5Transformer from zu4425 +author: John Snow Labs +name: meetingbank_qa_summary_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`meetingbank_qa_summary_pipeline` is a English model originally trained by zu4425. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/meetingbank_qa_summary_pipeline_en_5.5.1_3.0_1734332832265.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/meetingbank_qa_summary_pipeline_en_5.5.1_3.0_1734332832265.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

pipeline = PretrainedPipeline("meetingbank_qa_summary_pipeline", lang = "en")
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

val pipeline = new PretrainedPipeline("meetingbank_qa_summary_pipeline", lang = "en")
val df = Seq("I love spark-nlp").toDS.toDF("text")
val annotations = pipeline.transform(df)

```
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|meetingbank_qa_summary_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|284.4 MB| + +## References + +https://huggingface.co/zu4425/MeetingBank-QA-Summary + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-modeldigitoun_en.md b/docs/_posts/ahmedlone127/2024-12-16-modeldigitoun_en.md new file mode 100644 index 00000000000000..b089f0ea9b607f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-modeldigitoun_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English modeldigitoun T5Transformer from Digitoun +author: John Snow Labs +name: modeldigitoun +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`modeldigitoun` is a English model originally trained by Digitoun. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/modeldigitoun_en_5.5.1_3.0_1734327433876.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/modeldigitoun_en_5.5.1_3.0_1734327433876.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("modeldigitoun","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("modeldigitoun", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
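The generated text ends up in the `output` annotation column configured above. A minimal sketch for reading it back, assuming the `pipelineDF` produced in the Python example:

```python
# Minimal sketch, assuming pipelineDF from the example above.
# "output.result" holds the strings generated by the model.
pipelineDF.select("text", "output.result").show(truncate=False)
```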
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|modeldigitoun| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|339.3 MB| + +## References + +https://huggingface.co/Digitoun/modeldigitoun \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-modeldigitoun_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-modeldigitoun_pipeline_en.md new file mode 100644 index 00000000000000..9968f4a68a3743 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-modeldigitoun_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English modeldigitoun_pipeline pipeline T5Transformer from Digitoun +author: John Snow Labs +name: modeldigitoun_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`modeldigitoun_pipeline` is a English model originally trained by Digitoun. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/modeldigitoun_pipeline_en_5.5.1_3.0_1734327453600.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/modeldigitoun_pipeline_en_5.5.1_3.0_1734327453600.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

pipeline = PretrainedPipeline("modeldigitoun_pipeline", lang = "en")
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

val pipeline = new PretrainedPipeline("modeldigitoun_pipeline", lang = "en")
val df = Seq("I love spark-nlp").toDS.toDF("text")
val annotations = pipeline.transform(df)

```
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|modeldigitoun_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|339.3 MB| + +## References + +https://huggingface.co/Digitoun/modeldigitoun + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mt5_base_english_wiki_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-mt5_base_english_wiki_pipeline_en.md new file mode 100644 index 00000000000000..7d9434f665e93a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mt5_base_english_wiki_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mt5_base_english_wiki_pipeline pipeline T5Transformer from e22vvb +author: John Snow Labs +name: mt5_base_english_wiki_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_base_english_wiki_pipeline` is a English model originally trained by e22vvb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_base_english_wiki_pipeline_en_5.5.1_3.0_1734331499520.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_base_english_wiki_pipeline_en_5.5.1_3.0_1734331499520.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

pipeline = PretrainedPipeline("mt5_base_english_wiki_pipeline", lang = "en")
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

val pipeline = new PretrainedPipeline("mt5_base_english_wiki_pipeline", lang = "en")
val df = Seq("I love spark-nlp").toDS.toDF("text")
val annotations = pipeline.transform(df)

```
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_base_english_wiki_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.5 GB| + +## References + +https://huggingface.co/e22vvb/mt5_base_EN_wiki + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mt5_base_qa_v2_en.md b/docs/_posts/ahmedlone127/2024-12-16-mt5_base_qa_v2_en.md new file mode 100644 index 00000000000000..bc88ff744e0e16 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mt5_base_qa_v2_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English mt5_base_qa_v2 T5Transformer from hawalurahman +author: John Snow Labs +name: mt5_base_qa_v2 +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_base_qa_v2` is a English model originally trained by hawalurahman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_base_qa_v2_en_5.5.1_3.0_1734333550333.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_base_qa_v2_en_5.5.1_3.0_1734333550333.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("mt5_base_qa_v2","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("mt5_base_qa_v2", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
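The generated answers end up in the `output` annotation column configured above. A minimal sketch for reading them back, assuming the `pipelineDF` produced in the Python example:

```python
# Minimal sketch, assuming pipelineDF from the example above.
# "output.result" holds the strings generated by the model.
pipelineDF.select("text", "output.result").show(truncate=False)
```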
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_base_qa_v2| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|2.3 GB| + +## References + +https://huggingface.co/hawalurahman/mt5-base-qa_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mt5_base_qa_v2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-mt5_base_qa_v2_pipeline_en.md new file mode 100644 index 00000000000000..1ce97c1942a963 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mt5_base_qa_v2_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mt5_base_qa_v2_pipeline pipeline T5Transformer from hawalurahman +author: John Snow Labs +name: mt5_base_qa_v2_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_base_qa_v2_pipeline` is a English model originally trained by hawalurahman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_base_qa_v2_pipeline_en_5.5.1_3.0_1734333735101.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_base_qa_v2_pipeline_en_5.5.1_3.0_1734333735101.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mt5_base_qa_v2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mt5_base_qa_v2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
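+
+The `df` used above is any Spark DataFrame with a `text` column. A minimal end-to-end sketch, assuming a fresh Spark NLP session and using the output column name from the model card above; the example sentence is illustrative:
+
+```python
+import sparknlp
+from sparknlp.pretrained import PretrainedPipeline
+
+spark = sparknlp.start()
+
+pipeline = PretrainedPipeline("mt5_base_qa_v2_pipeline", lang="en")
+
+# Option 1: annotate a single string directly
+result = pipeline.annotate("I love spark-nlp")
+
+# Option 2: transform a DataFrame with a "text" column
+df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+annotations = pipeline.transform(df)
+annotations.select("output.result").show(truncate=False)
+```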
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_base_qa_v2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|2.3 GB| + +## References + +https://huggingface.co/hawalurahman/mt5-base-qa_v2 + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mt5_base_thai_wiki_en.md b/docs/_posts/ahmedlone127/2024-12-16-mt5_base_thai_wiki_en.md new file mode 100644 index 00000000000000..5eeaacf2467922 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mt5_base_thai_wiki_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English mt5_base_thai_wiki T5Transformer from e22vvb +author: John Snow Labs +name: mt5_base_thai_wiki +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_base_thai_wiki` is a English model originally trained by e22vvb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_base_thai_wiki_en_5.5.1_3.0_1734330975850.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_base_thai_wiki_en_5.5.1_3.0_1734330975850.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("mt5_base_thai_wiki","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("mt5_base_thai_wiki", "en")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div><div class="h3-box" markdown="1">
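+
+For single-string inference without building a DataFrame, Spark NLP's `LightPipeline` can wrap the fitted pipeline. A small sketch, assuming the `pipelineModel` fitted above; the example input is illustrative:
+
+```python
+from sparknlp.base import LightPipeline
+
+light = LightPipeline(pipelineModel)
+
+# annotate() returns a dict mapping each output column to a list of results
+result = light.annotate("I love spark-nlp")
+print(result["output"])
+```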
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_base_thai_wiki| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.5 GB| + +## References + +https://huggingface.co/e22vvb/mt5_base_TH_wiki \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mt5_base_thai_wiki_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-mt5_base_thai_wiki_pipeline_en.md new file mode 100644 index 00000000000000..8108cf22d18762 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mt5_base_thai_wiki_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mt5_base_thai_wiki_pipeline pipeline T5Transformer from e22vvb +author: John Snow Labs +name: mt5_base_thai_wiki_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_base_thai_wiki_pipeline` is a English model originally trained by e22vvb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_base_thai_wiki_pipeline_en_5.5.1_3.0_1734331477725.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_base_thai_wiki_pipeline_en_5.5.1_3.0_1734331477725.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mt5_base_thai_wiki_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mt5_base_thai_wiki_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_base_thai_wiki_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.5 GB| + +## References + +https://huggingface.co/e22vvb/mt5_base_TH_wiki + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mt5_bleu_durga_q1_clean_en.md b/docs/_posts/ahmedlone127/2024-12-16-mt5_bleu_durga_q1_clean_en.md new file mode 100644 index 00000000000000..30d64bbb8be6ff --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mt5_bleu_durga_q1_clean_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English mt5_bleu_durga_q1_clean T5Transformer from devagonal +author: John Snow Labs +name: mt5_bleu_durga_q1_clean +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_bleu_durga_q1_clean` is a English model originally trained by devagonal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_bleu_durga_q1_clean_en_5.5.1_3.0_1734332199442.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_bleu_durga_q1_clean_en_5.5.1_3.0_1734332199442.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("mt5_bleu_durga_q1_clean","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("mt5_bleu_durga_q1_clean", "en")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div><div class="h3-box" markdown="1">
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_bleu_durga_q1_clean| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|2.2 GB| + +## References + +https://huggingface.co/devagonal/mt5-bleu-durga-q1-clean \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mt5_bleu_durga_q1_clean_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-mt5_bleu_durga_q1_clean_pipeline_en.md new file mode 100644 index 00000000000000..38ae5ebc4a945b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mt5_bleu_durga_q1_clean_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mt5_bleu_durga_q1_clean_pipeline pipeline T5Transformer from devagonal +author: John Snow Labs +name: mt5_bleu_durga_q1_clean_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_bleu_durga_q1_clean_pipeline` is a English model originally trained by devagonal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_bleu_durga_q1_clean_pipeline_en_5.5.1_3.0_1734332410606.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_bleu_durga_q1_clean_pipeline_en_5.5.1_3.0_1734332410606.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mt5_bleu_durga_q1_clean_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mt5_bleu_durga_q1_clean_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_bleu_durga_q1_clean_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|2.2 GB| + +## References + +https://huggingface.co/devagonal/mt5-bleu-durga-q1-clean + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mt5_small_finetuned_xsum_guan06_en.md b/docs/_posts/ahmedlone127/2024-12-16-mt5_small_finetuned_xsum_guan06_en.md new file mode 100644 index 00000000000000..0a0df2c41eabda --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mt5_small_finetuned_xsum_guan06_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English mt5_small_finetuned_xsum_guan06 T5Transformer from guan06 +author: John Snow Labs +name: mt5_small_finetuned_xsum_guan06 +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_small_finetuned_xsum_guan06` is a English model originally trained by guan06. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_small_finetuned_xsum_guan06_en_5.5.1_3.0_1734328284294.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_small_finetuned_xsum_guan06_en_5.5.1_3.0_1734328284294.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("mt5_small_finetuned_xsum_guan06","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("mt5_small_finetuned_xsum_guan06", "en")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div><div class="h3-box" markdown="1">
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_small_finetuned_xsum_guan06| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|819.8 MB| + +## References + +https://huggingface.co/guan06/mt5-small-finetuned-xsum \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mt5_small_finetuned_xsum_guan06_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-mt5_small_finetuned_xsum_guan06_pipeline_en.md new file mode 100644 index 00000000000000..13e5fab8747a57 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mt5_small_finetuned_xsum_guan06_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mt5_small_finetuned_xsum_guan06_pipeline pipeline T5Transformer from guan06 +author: John Snow Labs +name: mt5_small_finetuned_xsum_guan06_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_small_finetuned_xsum_guan06_pipeline` is a English model originally trained by guan06. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_small_finetuned_xsum_guan06_pipeline_en_5.5.1_3.0_1734328550461.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_small_finetuned_xsum_guan06_pipeline_en_5.5.1_3.0_1734328550461.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mt5_small_finetuned_xsum_guan06_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mt5_small_finetuned_xsum_guan06_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_small_finetuned_xsum_guan06_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|819.8 MB| + +## References + +https://huggingface.co/guan06/mt5-small-finetuned-xsum + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mt5_small_ganda_inf_english_en.md b/docs/_posts/ahmedlone127/2024-12-16-mt5_small_ganda_inf_english_en.md new file mode 100644 index 00000000000000..890f371fe63fb1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mt5_small_ganda_inf_english_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English mt5_small_ganda_inf_english T5Transformer from MubarakB +author: John Snow Labs +name: mt5_small_ganda_inf_english +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_small_ganda_inf_english` is a English model originally trained by MubarakB. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_small_ganda_inf_english_en_5.5.1_3.0_1734333352921.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_small_ganda_inf_english_en_5.5.1_3.0_1734333352921.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("mt5_small_ganda_inf_english","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("mt5_small_ganda_inf_english", "en")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div><div class="h3-box" markdown="1">
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_small_ganda_inf_english| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|333.6 MB| + +## References + +https://huggingface.co/MubarakB/mt5_small_lg_inf_en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mt5_small_ganda_inf_english_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-mt5_small_ganda_inf_english_pipeline_en.md new file mode 100644 index 00000000000000..a9f78f1fb3a58f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mt5_small_ganda_inf_english_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mt5_small_ganda_inf_english_pipeline pipeline T5Transformer from MubarakB +author: John Snow Labs +name: mt5_small_ganda_inf_english_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_small_ganda_inf_english_pipeline` is a English model originally trained by MubarakB. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_small_ganda_inf_english_pipeline_en_5.5.1_3.0_1734333373521.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_small_ganda_inf_english_pipeline_en_5.5.1_3.0_1734333373521.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mt5_small_ganda_inf_english_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mt5_small_ganda_inf_english_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_small_ganda_inf_english_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|333.6 MB| + +## References + +https://huggingface.co/MubarakB/mt5_small_lg_inf_en + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mt5_summarize_arabic_english_en.md b/docs/_posts/ahmedlone127/2024-12-16-mt5_summarize_arabic_english_en.md new file mode 100644 index 00000000000000..e93cf40a5acc58 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mt5_summarize_arabic_english_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English mt5_summarize_arabic_english T5Transformer from YoussefAnwar +author: John Snow Labs +name: mt5_summarize_arabic_english +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_summarize_arabic_english` is a English model originally trained by YoussefAnwar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_summarize_arabic_english_en_5.5.1_3.0_1734327884653.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_summarize_arabic_english_en_5.5.1_3.0_1734327884653.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("mt5_summarize_arabic_english","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("mt5_summarize_arabic_english", "en")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div><div class="h3-box" markdown="1">
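+
+For summarization-style generation, the T5 annotator exposes a task prefix and output-length controls. The following is a sketch under the assumption that the model follows the usual "summarize:" prompt convention; the exact prefix and the `summary` output column name are illustrative, not confirmed by the upstream model card:
+
+```python
+# Reuse the documentAssembler defined above and add generation controls.
+t5_summarizer = T5Transformer.pretrained("mt5_summarize_arabic_english", "en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("summary") \
+    .setTask("summarize:") \
+    .setMaxOutputLength(128)
+
+summary_pipeline = Pipeline().setStages([documentAssembler, t5_summarizer])
+```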
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_summarize_arabic_english| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/YoussefAnwar/mt5-summarize-ar-en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-mt5_summarize_arabic_english_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-mt5_summarize_arabic_english_pipeline_en.md new file mode 100644 index 00000000000000..bc7662f740e189 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-mt5_summarize_arabic_english_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English mt5_summarize_arabic_english_pipeline pipeline T5Transformer from YoussefAnwar +author: John Snow Labs +name: mt5_summarize_arabic_english_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mt5_summarize_arabic_english_pipeline` is a English model originally trained by YoussefAnwar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mt5_summarize_arabic_english_pipeline_en_5.5.1_3.0_1734327971896.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mt5_summarize_arabic_english_pipeline_en_5.5.1_3.0_1734327971896.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("mt5_summarize_arabic_english_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("mt5_summarize_arabic_english_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mt5_summarize_arabic_english_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/YoussefAnwar/mt5-summarize-ar-en + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-multi_qa_mpnet_base_dot_v1_covidqa_search_en.md b/docs/_posts/ahmedlone127/2024-12-16-multi_qa_mpnet_base_dot_v1_covidqa_search_en.md new file mode 100644 index 00000000000000..53cfd909294725 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-multi_qa_mpnet_base_dot_v1_covidqa_search_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English multi_qa_mpnet_base_dot_v1_covidqa_search MPNetEmbeddings from checkiejan +author: John Snow Labs +name: multi_qa_mpnet_base_dot_v1_covidqa_search +date: 2024-12-16 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`multi_qa_mpnet_base_dot_v1_covidqa_search` is a English model originally trained by checkiejan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/multi_qa_mpnet_base_dot_v1_covidqa_search_en_5.5.1_3.0_1734316408749.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/multi_qa_mpnet_base_dot_v1_covidqa_search_en_5.5.1_3.0_1734316408749.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("multi_qa_mpnet_base_dot_v1_covidqa_search","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("multi_qa_mpnet_base_dot_v1_covidqa_search","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
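+
+The sentence embeddings land in the `embeddings` column as Spark NLP annotations. A minimal sketch for pulling the raw vectors out of the `pipelineDF` produced by the snippet above:
+
+```python
+from pyspark.sql import functions as F
+
+# Each annotation carries its vector in the "embeddings" field.
+vectors = pipelineDF.select(F.explode("embeddings.embeddings").alias("vector"))
+vectors.show(1, truncate=80)
+```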
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|multi_qa_mpnet_base_dot_v1_covidqa_search| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/checkiejan/multi-qa-mpnet-base-dot-v1-covidqa-search \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-multi_qa_mpnet_base_dot_v1_covidqa_search_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-multi_qa_mpnet_base_dot_v1_covidqa_search_pipeline_en.md new file mode 100644 index 00000000000000..fe51ab08070526 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-multi_qa_mpnet_base_dot_v1_covidqa_search_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English multi_qa_mpnet_base_dot_v1_covidqa_search_pipeline pipeline MPNetEmbeddings from checkiejan +author: John Snow Labs +name: multi_qa_mpnet_base_dot_v1_covidqa_search_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`multi_qa_mpnet_base_dot_v1_covidqa_search_pipeline` is a English model originally trained by checkiejan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/multi_qa_mpnet_base_dot_v1_covidqa_search_pipeline_en_5.5.1_3.0_1734316446482.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/multi_qa_mpnet_base_dot_v1_covidqa_search_pipeline_en_5.5.1_3.0_1734316446482.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("multi_qa_mpnet_base_dot_v1_covidqa_search_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("multi_qa_mpnet_base_dot_v1_covidqa_search_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|multi_qa_mpnet_base_dot_v1_covidqa_search_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/checkiejan/multi-qa-mpnet-base-dot-v1-covidqa-search + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-name_entity_recognizer_en.md b/docs/_posts/ahmedlone127/2024-12-16-name_entity_recognizer_en.md new file mode 100644 index 00000000000000..03055661b4500e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-name_entity_recognizer_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English name_entity_recognizer DistilBertForTokenClassification from SKT27182 +author: John Snow Labs +name: name_entity_recognizer +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`name_entity_recognizer` is a English model originally trained by SKT27182. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/name_entity_recognizer_en_5.5.1_3.0_1734310411372.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/name_entity_recognizer_en_5.5.1_3.0_1734310411372.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("name_entity_recognizer","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification.pretrained("name_entity_recognizer", "en")
+  .setInputCols(Array("document","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div><div class="h3-box" markdown="1">
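+
+To turn the token-level `ner` tags into entity chunks, Spark NLP's `NerConverter` can be appended to the same pipeline. A sketch, assuming the stages and `data` defined above; the `ner_chunk` column name is illustrative:
+
+```python
+from sparknlp.annotator import NerConverter
+
+nerConverter = NerConverter() \
+    .setInputCols(["document", "token", "ner"]) \
+    .setOutputCol("ner_chunk")
+
+ner_pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier, nerConverter])
+ner_pipeline.fit(data).transform(data).select("ner_chunk.result").show(truncate=False)
+```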
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|name_entity_recognizer| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/SKT27182/Name_Entity_Recognizer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-name_entity_recognizer_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-name_entity_recognizer_pipeline_en.md new file mode 100644 index 00000000000000..631b9245f79d9d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-name_entity_recognizer_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English name_entity_recognizer_pipeline pipeline DistilBertForTokenClassification from SKT27182 +author: John Snow Labs +name: name_entity_recognizer_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`name_entity_recognizer_pipeline` is a English model originally trained by SKT27182. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/name_entity_recognizer_pipeline_en_5.5.1_3.0_1734310429931.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/name_entity_recognizer_pipeline_en_5.5.1_3.0_1734310429931.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("name_entity_recognizer_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("name_entity_recognizer_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|name_entity_recognizer_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/SKT27182/Name_Entity_Recognizer + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-ner_fine_tuned_beto_finetuned_ner_raulgdp_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-ner_fine_tuned_beto_finetuned_ner_raulgdp_pipeline_en.md new file mode 100644 index 00000000000000..93dcad69dae9d9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-ner_fine_tuned_beto_finetuned_ner_raulgdp_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English ner_fine_tuned_beto_finetuned_ner_raulgdp_pipeline pipeline BertForTokenClassification from raulgdp +author: John Snow Labs +name: ner_fine_tuned_beto_finetuned_ner_raulgdp_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ner_fine_tuned_beto_finetuned_ner_raulgdp_pipeline` is a English model originally trained by raulgdp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_fine_tuned_beto_finetuned_ner_raulgdp_pipeline_en_5.5.1_3.0_1734337218469.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_fine_tuned_beto_finetuned_ner_raulgdp_pipeline_en_5.5.1_3.0_1734337218469.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("ner_fine_tuned_beto_finetuned_ner_raulgdp_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("ner_fine_tuned_beto_finetuned_ner_raulgdp_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
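+
+Beyond `transform`, the pretrained pipeline can also be applied to plain strings; `fullAnnotate` keeps the metadata (including tag confidences) attached to each token. A sketch with an illustrative Spanish sentence, since the underlying BETO base model is a Spanish BERT:
+
+```python
+from sparknlp.pretrained import PretrainedPipeline
+
+pipeline = PretrainedPipeline("ner_fine_tuned_beto_finetuned_ner_raulgdp_pipeline", lang="en")
+
+annotated = pipeline.fullAnnotate("John Snow Labs tiene oficinas en Estados Unidos.")
+for token in annotated[0]["ner"]:
+    print(token.result, token.metadata)
+```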
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_fine_tuned_beto_finetuned_ner_raulgdp_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/raulgdp/NER-fine-tuned-BETO-finetuned-ner + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-ner_model_2_en.md b/docs/_posts/ahmedlone127/2024-12-16-ner_model_2_en.md new file mode 100644 index 00000000000000..fa840298f4c118 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-ner_model_2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English ner_model_2 DistilBertForTokenClassification from Rizzler-gyatt-69 +author: John Snow Labs +name: ner_model_2 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ner_model_2` is a English model originally trained by Rizzler-gyatt-69. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_model_2_en_5.5.1_3.0_1734310209734.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_model_2_en_5.5.1_3.0_1734310209734.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("ner_model_2","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification.pretrained("ner_model_2", "en")
+  .setInputCols(Array("document","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div><div class="h3-box" markdown="1">
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_model_2| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/Rizzler-gyatt-69/ner_model_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-ner_model_2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-ner_model_2_pipeline_en.md new file mode 100644 index 00000000000000..ceb5947a28f30c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-ner_model_2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English ner_model_2_pipeline pipeline DistilBertForTokenClassification from Rizzler-gyatt-69 +author: John Snow Labs +name: ner_model_2_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ner_model_2_pipeline` is a English model originally trained by Rizzler-gyatt-69. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_model_2_pipeline_en_5.5.1_3.0_1734310222520.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_model_2_pipeline_en_5.5.1_3.0_1734310222520.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("ner_model_2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("ner_model_2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_model_2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/Rizzler-gyatt-69/ner_model_2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-ner_model_3_en.md b/docs/_posts/ahmedlone127/2024-12-16-ner_model_3_en.md new file mode 100644 index 00000000000000..efd3f2afa2dbc2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-ner_model_3_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English ner_model_3 DistilBertForTokenClassification from Rizzler-gyatt-69 +author: John Snow Labs +name: ner_model_3 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ner_model_3` is a English model originally trained by Rizzler-gyatt-69. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_model_3_en_5.5.1_3.0_1734310980125.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_model_3_en_5.5.1_3.0_1734310980125.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("ner_model_3","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification.pretrained("ner_model_3", "en")
+  .setInputCols(Array("document","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div><div class="h3-box" markdown="1">
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_model_3| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/Rizzler-gyatt-69/ner_model_3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-ner_model_3_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-ner_model_3_pipeline_en.md new file mode 100644 index 00000000000000..a53d0e2ebf554a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-ner_model_3_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English ner_model_3_pipeline pipeline DistilBertForTokenClassification from Rizzler-gyatt-69 +author: John Snow Labs +name: ner_model_3_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ner_model_3_pipeline` is a English model originally trained by Rizzler-gyatt-69. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_model_3_pipeline_en_5.5.1_3.0_1734310992837.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_model_3_pipeline_en_5.5.1_3.0_1734310992837.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("ner_model_3_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("ner_model_3_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_model_3_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|243.9 MB| + +## References + +https://huggingface.co/Rizzler-gyatt-69/ner_model_3 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-ner_model_en.md b/docs/_posts/ahmedlone127/2024-12-16-ner_model_en.md new file mode 100644 index 00000000000000..b438b319bc3c10 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-ner_model_en.md @@ -0,0 +1,96 @@ +--- +layout: model +title: English ner_model BertForTokenClassification from MichaelSargious +author: John Snow Labs +name: ner_model +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, bert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ner_model` is a English model originally trained by MichaelSargious. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_model_en_5.5.1_3.0_1734310900283.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_model_en_5.5.1_3.0_1734310900283.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = BertForTokenClassification.pretrained("ner_model","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification.pretrained("ner_model", "en")
+  .setInputCols(Array("document","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+</div><div class="h3-box" markdown="1">
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_model| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +References + +https://huggingface.co/MichaelSargious/ner_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-ner_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-ner_model_pipeline_en.md new file mode 100644 index 00000000000000..96ea280c954d67 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-ner_model_pipeline_en.md @@ -0,0 +1,72 @@ +--- +layout: model +title: English ner_model_pipeline pipeline BertForTokenClassification from MichaelSargious +author: John Snow Labs +name: ner_model_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ner_model_pipeline` is a English model originally trained by MichaelSargious. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_model_pipeline_en_5.5.1_3.0_1734310913111.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_model_pipeline_en_5.5.1_3.0_1734310913111.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +pipeline = PretrainedPipeline("ner_model_pipeline", lang = "en") +annotations = pipeline.transform(df) +``` +```scala +val pipeline = new PretrainedPipeline("ner_model_pipeline", lang = "en") +val annotations = pipeline.transform(df) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +References + +https://huggingface.co/MichaelSargious/ner_model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-nooks_amd_detection_en.md b/docs/_posts/ahmedlone127/2024-12-16-nooks_amd_detection_en.md new file mode 100644 index 00000000000000..4565b3f2d86b38 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-nooks_amd_detection_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English nooks_amd_detection MPNetEmbeddings from nikcheerla +author: John Snow Labs +name: nooks_amd_detection +date: 2024-12-16 +tags: [en, open_source, onnx, embeddings, mpnet] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: MPNetEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`nooks_amd_detection` is a English model originally trained by nikcheerla. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nooks_amd_detection_en_5.5.1_3.0_1734316408758.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nooks_amd_detection_en_5.5.1_3.0_1734316408758.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +embeddings = MPNetEmbeddings.pretrained("nooks_amd_detection","en") \ + .setInputCols(["document"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([documentAssembler, embeddings]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val embeddings = MPNetEmbeddings.pretrained("nooks_amd_detection","en") + .setInputCols(Array("document")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings)) +val data = Seq("I love spark-nlp").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
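+
+A common use of these embeddings is semantic similarity between texts. A small illustrative sketch (the sentence pair and variable names are assumptions) that collects two vectors from the fitted `pipelineModel` above and compares them with cosine similarity:
+
+```python
+import numpy as np
+
+pairs = spark.createDataFrame([["I love spark-nlp"], ["Spark NLP is great"]]).toDF("text")
+emb_rows = pipelineModel.transform(pairs).select("embeddings.embeddings").collect()
+
+# Each row holds a list of annotation vectors; take the first vector per text.
+v1 = np.array(emb_rows[0][0][0])
+v2 = np.array(emb_rows[1][0][0])
+cosine = float(v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2)))
+print(f"cosine similarity: {cosine:.3f}")
+```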
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nooks_amd_detection| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[mpnet]| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/nikcheerla/nooks-amd-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-nooks_amd_detection_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-nooks_amd_detection_pipeline_en.md new file mode 100644 index 00000000000000..e2f9efe5cccced --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-nooks_amd_detection_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English nooks_amd_detection_pipeline pipeline MPNetEmbeddings from nikcheerla +author: John Snow Labs +name: nooks_amd_detection_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained MPNetEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`nooks_amd_detection_pipeline` is a English model originally trained by nikcheerla. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nooks_amd_detection_pipeline_en_5.5.1_3.0_1734316440637.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nooks_amd_detection_pipeline_en_5.5.1_3.0_1734316440637.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("nooks_amd_detection_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("nooks_amd_detection_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nooks_amd_detection_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/nikcheerla/nooks-amd-detection + +## Included Models + +- DocumentAssembler +- MPNetEmbeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-opus_books_model_french_en.md b/docs/_posts/ahmedlone127/2024-12-16-opus_books_model_french_en.md new file mode 100644 index 00000000000000..ae171bb22f7bef --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-opus_books_model_french_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English opus_books_model_french T5Transformer from Goshective +author: John Snow Labs +name: opus_books_model_french +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`opus_books_model_french` is a English model originally trained by Goshective. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/opus_books_model_french_en_5.5.1_3.0_1734328390494.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/opus_books_model_french_en_5.5.1_3.0_1734328390494.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +t5 = T5Transformer.pretrained("opus_books_model_french","en") \ + .setInputCols(["document"]) \ + .setOutputCol("output") + +pipeline = Pipeline().setStages([documentAssembler, t5]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCols("text") + .setOutputCols("document") + +val t5 = T5Transformer.pretrained("opus_books_model_french", "en") + .setInputCols(Array("documents")) + .setOutputCol("output") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, t5)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
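T5 models are usually driven by a task prefix and a generation limit. The prefix and output length below are illustrative assumptions, not taken from the model card, and the sketch reuses `documentAssembler` and `data` from the Python snippet above:

```python
# Assumption: an English-to-French setup in the spirit of opus_books; adjust
# the prefix to whatever the original fine-tune actually used.
t5 = T5Transformer.pretrained("opus_books_model_french", "en") \
    .setTask("translate English to French:") \
    .setMaxOutputLength(128) \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
result = pipeline.fit(data).transform(data)
result.select("output.result").show(truncate=False)
```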
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|opus_books_model_french| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|340.4 MB| + +## References + +https://huggingface.co/Goshective/opus_books_model_french \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-opus_books_model_french_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-opus_books_model_french_pipeline_en.md new file mode 100644 index 00000000000000..68c777c77f71ff --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-opus_books_model_french_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English opus_books_model_french_pipeline pipeline T5Transformer from Goshective +author: John Snow Labs +name: opus_books_model_french_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`opus_books_model_french_pipeline` is a English model originally trained by Goshective. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/opus_books_model_french_pipeline_en_5.5.1_3.0_1734328410890.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/opus_books_model_french_pipeline_en_5.5.1_3.0_1734328410890.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("opus_books_model_french_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("opus_books_model_french_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|opus_books_model_french_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|340.4 MB| + +## References + +https://huggingface.co/Goshective/opus_books_model_french + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-osa_custom_ner_model_en.md b/docs/_posts/ahmedlone127/2024-12-16-osa_custom_ner_model_en.md new file mode 100644 index 00000000000000..bd8e8ba7ee0194 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-osa_custom_ner_model_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English osa_custom_ner_model DistilBertForTokenClassification from AnanthanarayananSeetharaman +author: John Snow Labs +name: osa_custom_ner_model +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`osa_custom_ner_model` is a English model originally trained by AnanthanarayananSeetharaman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/osa_custom_ner_model_en_5.5.1_3.0_1734310622553.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/osa_custom_ner_model_en_5.5.1_3.0_1734310622553.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = DistilBertForTokenClassification.pretrained("osa_custom_ner_model","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = DistilBertForTokenClassification.pretrained("osa_custom_ner_model", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
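To turn the token-level tags into whole entity spans, a `NerConverter` stage can be appended. A minimal sketch reusing the Python objects defined above:

```python
from sparknlp.annotator import NerConverter

# Groups B-/I- token tags from the "ner" column into entity chunks.
nerConverter = NerConverter() \
    .setInputCols(["document", "token", "ner"]) \
    .setOutputCol("ner_chunk")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier, nerConverter])
pipeline.fit(data).transform(data).select("ner_chunk.result").show(truncate=False)
```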
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|osa_custom_ner_model| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/AnanthanarayananSeetharaman/osa-custom-ner-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-osa_custom_ner_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-osa_custom_ner_model_pipeline_en.md new file mode 100644 index 00000000000000..105555ae5ef362 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-osa_custom_ner_model_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English osa_custom_ner_model_pipeline pipeline DistilBertForTokenClassification from AnanthanarayananSeetharaman +author: John Snow Labs +name: osa_custom_ner_model_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`osa_custom_ner_model_pipeline` is a English model originally trained by AnanthanarayananSeetharaman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/osa_custom_ner_model_pipeline_en_5.5.1_3.0_1734310635346.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/osa_custom_ner_model_pipeline_en_5.5.1_3.0_1734310635346.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("osa_custom_ner_model_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("osa_custom_ner_model_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|osa_custom_ner_model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/AnanthanarayananSeetharaman/osa-custom-ner-model + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-pii_detection_v2_1_en.md b/docs/_posts/ahmedlone127/2024-12-16-pii_detection_v2_1_en.md new file mode 100644 index 00000000000000..fecf4546defcd0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-pii_detection_v2_1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English pii_detection_v2_1 DistilBertForTokenClassification from deepaksiloka +author: John Snow Labs +name: pii_detection_v2_1 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`pii_detection_v2_1` is a English model originally trained by deepaksiloka. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pii_detection_v2_1_en_5.5.1_3.0_1734310762658.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pii_detection_v2_1_en_5.5.1_3.0_1734310762658.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = DistilBertForTokenClassification.pretrained("pii_detection_v2_1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCols("text") + .setOutputCols("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("pii_detection_v2_1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pii_detection_v2_1| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.4 MB| + +## References + +https://huggingface.co/deepaksiloka/PII-Detection-V2.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-pii_detector_ai4privacy_pipeline_xx.md b/docs/_posts/ahmedlone127/2024-12-16-pii_detector_ai4privacy_pipeline_xx.md new file mode 100644 index 00000000000000..6b632f4752b2df --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-pii_detector_ai4privacy_pipeline_xx.md @@ -0,0 +1,70 @@ +--- +layout: model +title: Multilingual pii_detector_ai4privacy_pipeline pipeline DistilBertForTokenClassification from molise-ai +author: John Snow Labs +name: pii_detector_ai4privacy_pipeline +date: 2024-12-16 +tags: [xx, open_source, pipeline, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`pii_detector_ai4privacy_pipeline` is a Multilingual model originally trained by molise-ai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pii_detector_ai4privacy_pipeline_xx_5.5.1_3.0_1734310411223.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pii_detector_ai4privacy_pipeline_xx_5.5.1_3.0_1734310411223.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("pii_detector_ai4privacy_pipeline", lang = "xx") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("pii_detector_ai4privacy_pipeline", lang = "xx") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pii_detector_ai4privacy_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|xx| +|Size:|505.5 MB| + +## References + +https://huggingface.co/molise-ai/pii-detector-ai4privacy + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-pii_detector_ai4privacy_xx.md b/docs/_posts/ahmedlone127/2024-12-16-pii_detector_ai4privacy_xx.md new file mode 100644 index 00000000000000..16948b4ad89d6e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-pii_detector_ai4privacy_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual pii_detector_ai4privacy DistilBertForTokenClassification from molise-ai +author: John Snow Labs +name: pii_detector_ai4privacy +date: 2024-12-16 +tags: [xx, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`pii_detector_ai4privacy` is a Multilingual model originally trained by molise-ai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pii_detector_ai4privacy_xx_5.5.1_3.0_1734310385556.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pii_detector_ai4privacy_xx_5.5.1_3.0_1734310385556.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = DistilBertForTokenClassification.pretrained("pii_detector_ai4privacy","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCols("text") + .setOutputCols("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("pii_detector_ai4privacy", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pii_detector_ai4privacy| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.5 MB| + +## References + +https://huggingface.co/molise-ai/pii-detector-ai4privacy \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-plant_classification_en.md b/docs/_posts/ahmedlone127/2024-12-16-plant_classification_en.md new file mode 100644 index 00000000000000..df815ae7057604 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-plant_classification_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English plant_classification SwinForImageClassification from brigettesegovia +author: John Snow Labs +name: plant_classification +date: 2024-12-16 +tags: [en, open_source, onnx, image_classification, swin] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: SwinForImageClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`plant_classification` is a English model originally trained by brigettesegovia. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/plant_classification_en_5.5.1_3.0_1734325133716.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/plant_classification_en_5.5.1_3.0_1734325133716.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

image_assembler = ImageAssembler() \
    .setInputCol("image") \
    .setOutputCol("image_assembler")

imageClassifier = SwinForImageClassification.pretrained("plant_classification","en") \
    .setInputCols("image_assembler") \
    .setOutputCol("class")

pipeline = Pipeline(stages=[
    image_assembler,
    imageClassifier,
])

pipelineModel = pipeline.fit(imageDF)

pipelineDF = pipelineModel.transform(imageDF)

```
```scala

val imageAssembler = new ImageAssembler()
    .setInputCol("image")
    .setOutputCol("image_assembler")

val imageClassifier = SwinForImageClassification.pretrained("plant_classification","en")
    .setInputCols("image_assembler")
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(imageAssembler, imageClassifier))

val pipelineModel = pipeline.fit(imageDF)

val pipelineDF = pipelineModel.transform(imageDF)

```
</div>
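The `imageDF` referenced above is a DataFrame of raw images. A minimal sketch of building it with Spark's image data source (the folder path is a placeholder) and of reading the predicted labels back afterwards:

```python
# Placeholder path; point it at a folder of plant images.
imageDF = spark.read \
    .format("image") \
    .option("dropInvalid", True) \
    .load("path/to/plant/images")

# After fitting and transforming as shown above:
pipelineDF.select("image.origin", "class.result").show(truncate=False)
```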
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|plant_classification| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[image_assembler]| +|Output Labels:|[label]| +|Language:|en| +|Size:|206.3 MB| + +## References + +https://huggingface.co/brigettesegovia/plant_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-plant_classification_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-plant_classification_pipeline_en.md new file mode 100644 index 00000000000000..37ff16eb1f5342 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-plant_classification_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English plant_classification_pipeline pipeline SwinForImageClassification from brigettesegovia +author: John Snow Labs +name: plant_classification_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`plant_classification_pipeline` is a English model originally trained by brigettesegovia. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/plant_classification_pipeline_en_5.5.1_3.0_1734325144283.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/plant_classification_pipeline_en_5.5.1_3.0_1734325144283.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("plant_classification_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("plant_classification_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|plant_classification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|206.3 MB| + +## References + +https://huggingface.co/brigettesegovia/plant_classification + +## Included Models + +- ImageAssembler +- SwinForImageClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-pruebamodelotfm_bert_spanish_en.md b/docs/_posts/ahmedlone127/2024-12-16-pruebamodelotfm_bert_spanish_en.md new file mode 100644 index 00000000000000..67bdb7f9519406 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-pruebamodelotfm_bert_spanish_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English pruebamodelotfm_bert_spanish BertForQuestionAnswering from pamelapaolacb +author: John Snow Labs +name: pruebamodelotfm_bert_spanish +date: 2024-12-16 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`pruebamodelotfm_bert_spanish` is a English model originally trained by pamelapaolacb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pruebamodelotfm_bert_spanish_en_5.5.1_3.0_1734338359831.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pruebamodelotfm_bert_spanish_en_5.5.1_3.0_1734338359831.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = MultiDocumentAssembler() \
    .setInputCols(["question", "context"]) \
    .setOutputCols(["document_question", "document_context"])

spanClassifier = BertForQuestionAnswering.pretrained("pruebamodelotfm_bert_spanish","en") \
    .setInputCols(["document_question","document_context"]) \
    .setOutputCol("answer")

pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new MultiDocumentAssembler()
    .setInputCols(Array("question", "context"))
    .setOutputCols(Array("document_question", "document_context"))

val spanClassifier = BertForQuestionAnswering.pretrained("pruebamodelotfm_bert_spanish", "en")
    .setInputCols(Array("document_question","document_context"))
    .setOutputCol("answer")

val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
val data = Seq(("What framework do I use?","I use spark-nlp.")).toDF("question", "context")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
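Once the snippet above has run, the predicted span can be read straight from the `answer` column. A minimal sketch, assuming the Python snippet above:

```python
# "answer" holds one annotation per row; its "result" field is the span text.
pipelineDF.selectExpr(
    "question",
    "answer.result as predicted_answer"
).show(truncate=False)
```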
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pruebamodelotfm_bert_spanish| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/pamelapaolacb/pruebaModeloTFM_Bert_es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-qthang_finetuned_2_en.md b/docs/_posts/ahmedlone127/2024-12-16-qthang_finetuned_2_en.md new file mode 100644 index 00000000000000..d5b62ce8557425 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-qthang_finetuned_2_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English qthang_finetuned_2 BertForQuestionAnswering from ThangDinh +author: John Snow Labs +name: qthang_finetuned_2 +date: 2024-12-16 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`qthang_finetuned_2` is a English model originally trained by ThangDinh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qthang_finetuned_2_en_5.5.1_3.0_1734338614932.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qthang_finetuned_2_en_5.5.1_3.0_1734338614932.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = MultiDocumentAssembler() \ + .setInputCol(["question", "context"]) \ + .setOutputCol(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("qthang_finetuned_2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([documentAssembler, spanClassifier]) +data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("document_question", "document_context") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new MultiDocumentAssembler() + .setInputCol(Array("question", "context")) + .setOutputCol(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("qthang_finetuned_2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) +val data = Seq("What framework do I use?","I use spark-nlp.").toDS.toDF("document_question", "document_context") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qthang_finetuned_2| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ThangDinh/qthang-finetuned-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-qthang_finetuned_2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-qthang_finetuned_2_pipeline_en.md new file mode 100644 index 00000000000000..b90e775e6d9654 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-qthang_finetuned_2_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English qthang_finetuned_2_pipeline pipeline BertForQuestionAnswering from ThangDinh +author: John Snow Labs +name: qthang_finetuned_2_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`qthang_finetuned_2_pipeline` is a English model originally trained by ThangDinh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qthang_finetuned_2_pipeline_en_5.5.1_3.0_1734338636339.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qthang_finetuned_2_pipeline_en_5.5.1_3.0_1734338636339.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("qthang_finetuned_2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("qthang_finetuned_2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qthang_finetuned_2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ThangDinh/qthang-finetuned-2 + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-question_model_2data_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-question_model_2data_pipeline_en.md new file mode 100644 index 00000000000000..d601a316350ec5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-question_model_2data_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English question_model_2data_pipeline pipeline T5Transformer from kikaigakushuu +author: John Snow Labs +name: question_model_2data_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`question_model_2data_pipeline` is a English model originally trained by kikaigakushuu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/question_model_2data_pipeline_en_5.5.1_3.0_1734333206126.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/question_model_2data_pipeline_en_5.5.1_3.0_1734333206126.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("question_model_2data_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("question_model_2data_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|question_model_2data_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/kikaigakushuu/Question_model_2data + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-results_bbolint_en.md b/docs/_posts/ahmedlone127/2024-12-16-results_bbolint_en.md new file mode 100644 index 00000000000000..0f9085e1361294 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-results_bbolint_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English results_bbolint DistilBertForTokenClassification from bbolint +author: John Snow Labs +name: results_bbolint +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`results_bbolint` is a English model originally trained by bbolint. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/results_bbolint_en_5.5.1_3.0_1734310505384.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/results_bbolint_en_5.5.1_3.0_1734310505384.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = DistilBertForTokenClassification.pretrained("results_bbolint","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCols("text") + .setOutputCols("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("results_bbolint", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|results_bbolint| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/bbolint/results \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-results_bbolint_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-results_bbolint_pipeline_en.md new file mode 100644 index 00000000000000..424c2b86757274 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-results_bbolint_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English results_bbolint_pipeline pipeline DistilBertForTokenClassification from bbolint +author: John Snow Labs +name: results_bbolint_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`results_bbolint_pipeline` is a English model originally trained by bbolint. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/results_bbolint_pipeline_en_5.5.1_3.0_1734310518424.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/results_bbolint_pipeline_en_5.5.1_3.0_1734310518424.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("results_bbolint_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("results_bbolint_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|results_bbolint_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/bbolint/results + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-results_mgbam_en.md b/docs/_posts/ahmedlone127/2024-12-16-results_mgbam_en.md new file mode 100644 index 00000000000000..8d8527ea369417 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-results_mgbam_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English results_mgbam BertForQuestionAnswering from mgbam +author: John Snow Labs +name: results_mgbam +date: 2024-12-16 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`results_mgbam` is a English model originally trained by mgbam. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/results_mgbam_en_5.5.1_3.0_1734338902533.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/results_mgbam_en_5.5.1_3.0_1734338902533.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = MultiDocumentAssembler() \ + .setInputCol(["question", "context"]) \ + .setOutputCol(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("results_mgbam","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([documentAssembler, spanClassifier]) +data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("document_question", "document_context") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new MultiDocumentAssembler() + .setInputCol(Array("question", "context")) + .setOutputCol(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("results_mgbam", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) +val data = Seq("What framework do I use?","I use spark-nlp.").toDS.toDF("document_question", "document_context") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|results_mgbam| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/mgbam/results \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-results_mgbam_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-results_mgbam_pipeline_en.md new file mode 100644 index 00000000000000..aabac35794fc4e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-results_mgbam_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English results_mgbam_pipeline pipeline BertForQuestionAnswering from mgbam +author: John Snow Labs +name: results_mgbam_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`results_mgbam_pipeline` is a English model originally trained by mgbam. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/results_mgbam_pipeline_en_5.5.1_3.0_1734338923603.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/results_mgbam_pipeline_en_5.5.1_3.0_1734338923603.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("results_mgbam_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("results_mgbam_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|results_mgbam_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/mgbam/results + +## Included Models + +- MultiDocumentAssembler +- BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-scenario_non_kd_po_ner_full_xlmr_data_univner_half66_en.md b/docs/_posts/ahmedlone127/2024-12-16-scenario_non_kd_po_ner_full_xlmr_data_univner_half66_en.md new file mode 100644 index 00000000000000..984423f0dbfbfb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-scenario_non_kd_po_ner_full_xlmr_data_univner_half66_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English scenario_non_kd_po_ner_full_xlmr_data_univner_half66 XlmRoBertaForTokenClassification from haryoaw +author: John Snow Labs +name: scenario_non_kd_po_ner_full_xlmr_data_univner_half66 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`scenario_non_kd_po_ner_full_xlmr_data_univner_half66` is a English model originally trained by haryoaw. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scenario_non_kd_po_ner_full_xlmr_data_univner_half66_en_5.5.1_3.0_1734322611989.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scenario_non_kd_po_ner_full_xlmr_data_univner_half66_en_5.5.1_3.0_1734322611989.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +tokenizer = Tokenizer() \ + .setInputCols(['document']) \ + .setOutputCol('token') + +tokenClassifier = XlmRoBertaForTokenClassification.pretrained("scenario_non_kd_po_ner_full_xlmr_data_univner_half66","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCols("text") + .setOutputCols("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("scenario_non_kd_po_ner_full_xlmr_data_univner_half66", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
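The label set a fine-tuned token classifier predicts can be listed directly from the annotator. A one-line sketch, assuming the `tokenClassifier` from the Python snippet above:

```python
# Prints the NER tag inventory baked into the fine-tuned model.
print(tokenClassifier.getClasses())
```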
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scenario_non_kd_po_ner_full_xlmr_data_univner_half66| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|694.2 MB| + +## References + +https://huggingface.co/haryoaw/scenario-non-kd-po-ner-full-xlmr_data-univner_half66 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-scenario_non_kd_po_ner_full_xlmr_data_univner_half66_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-scenario_non_kd_po_ner_full_xlmr_data_univner_half66_pipeline_en.md new file mode 100644 index 00000000000000..903c70c2d8f49b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-scenario_non_kd_po_ner_full_xlmr_data_univner_half66_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English scenario_non_kd_po_ner_full_xlmr_data_univner_half66_pipeline pipeline XlmRoBertaForTokenClassification from haryoaw +author: John Snow Labs +name: scenario_non_kd_po_ner_full_xlmr_data_univner_half66_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`scenario_non_kd_po_ner_full_xlmr_data_univner_half66_pipeline` is a English model originally trained by haryoaw. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scenario_non_kd_po_ner_full_xlmr_data_univner_half66_pipeline_en_5.5.1_3.0_1734322696310.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scenario_non_kd_po_ner_full_xlmr_data_univner_half66_pipeline_en_5.5.1_3.0_1734322696310.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("scenario_non_kd_po_ner_full_xlmr_data_univner_half66_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("scenario_non_kd_po_ner_full_xlmr_data_univner_half66_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scenario_non_kd_po_ner_full_xlmr_data_univner_half66_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|694.2 MB| + +## References + +https://huggingface.co/haryoaw/scenario-non-kd-po-ner-full-xlmr_data-univner_half66 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-scientific_paper_summarization_en.md b/docs/_posts/ahmedlone127/2024-12-16-scientific_paper_summarization_en.md new file mode 100644 index 00000000000000..9f2cf97b2e7716 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-scientific_paper_summarization_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English scientific_paper_summarization T5Transformer from GilbertKrantz +author: John Snow Labs +name: scientific_paper_summarization +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`scientific_paper_summarization` is a English model originally trained by GilbertKrantz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scientific_paper_summarization_en_5.5.1_3.0_1734327702615.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scientific_paper_summarization_en_5.5.1_3.0_1734327702615.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol('text') \ + .setOutputCol('document') + +t5 = T5Transformer.pretrained("scientific_paper_summarization","en") \ + .setInputCols(["document"]) \ + .setOutputCol("output") + +pipeline = Pipeline().setStages([documentAssembler, t5]) +data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCols("text") + .setOutputCols("document") + +val t5 = T5Transformer.pretrained("scientific_paper_summarization", "en") + .setInputCols(Array("documents")) + .setOutputCol("output") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, t5)) +val data = Seq("I love spark-nlp").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
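For quick, single-document summarization without building a DataFrame, the fitted pipeline can be wrapped in a `LightPipeline`. A minimal sketch, assuming `pipelineModel` from the Python snippet above; the input string is a placeholder:

```python
from sparknlp.base import LightPipeline

light = LightPipeline(pipelineModel)
# annotate() returns a dict keyed by output columns; "output" holds the summary.
summary = light.annotate("Paste the abstract or section text to summarize here.")
print(summary["output"])
```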
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scientific_paper_summarization| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|318.3 MB| + +## References + +https://huggingface.co/GilbertKrantz/Scientific-Paper-Summarization \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-scientific_paper_summarization_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-scientific_paper_summarization_pipeline_en.md new file mode 100644 index 00000000000000..dbc89806c326f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-scientific_paper_summarization_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English scientific_paper_summarization_pipeline pipeline T5Transformer from GilbertKrantz +author: John Snow Labs +name: scientific_paper_summarization_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`scientific_paper_summarization_pipeline` is a English model originally trained by GilbertKrantz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scientific_paper_summarization_pipeline_en_5.5.1_3.0_1734327725599.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scientific_paper_summarization_pipeline_en_5.5.1_3.0_1734327725599.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("scientific_paper_summarization_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("scientific_paper_summarization_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
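
The snippets above assume a DataFrame `df` with a `text` column already exists. A minimal sketch of preparing one and inspecting the result, where the sample sentence is only an illustration:

```python
# df must expose a "text" column for the DocumentAssembler stage inside the pipeline.
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)
annotations.printSchema()  # shows which annotation columns the pretrained pipeline produced
```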
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scientific_paper_summarization_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|318.3 MB| + +## References + +https://huggingface.co/GilbertKrantz/Scientific-Paper-Summarization + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-sembr2023_distilbert_base_uncased_en.md b/docs/_posts/ahmedlone127/2024-12-16-sembr2023_distilbert_base_uncased_en.md new file mode 100644 index 00000000000000..37cd31e4c857d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-sembr2023_distilbert_base_uncased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English sembr2023_distilbert_base_uncased DistilBertForTokenClassification from admko +author: John Snow Labs +name: sembr2023_distilbert_base_uncased +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sembr2023_distilbert_base_uncased` is a English model originally trained by admko. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sembr2023_distilbert_base_uncased_en_5.5.1_3.0_1734310310505.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sembr2023_distilbert_base_uncased_en_5.5.1_3.0_1734310310505.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = DistilBertForTokenClassification.pretrained("sembr2023_distilbert_base_uncased","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = DistilBertForTokenClassification.pretrained("sembr2023_distilbert_base_uncased", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
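
After running the Python pipeline above, tokens and their predicted tags can be read back as aligned arrays; a minimal sketch:

```python
# token.result and ner.result are parallel arrays of tokens and predicted labels.
pipelineDF.select("token.result", "ner.result").show(truncate=False)
```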
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sembr2023_distilbert_base_uncased| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/admko/sembr2023-distilbert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-sembr2023_distilbert_base_uncased_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-sembr2023_distilbert_base_uncased_pipeline_en.md new file mode 100644 index 00000000000000..97115e06e95e91 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-sembr2023_distilbert_base_uncased_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English sembr2023_distilbert_base_uncased_pipeline pipeline DistilBertForTokenClassification from admko +author: John Snow Labs +name: sembr2023_distilbert_base_uncased_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sembr2023_distilbert_base_uncased_pipeline` is a English model originally trained by admko. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sembr2023_distilbert_base_uncased_pipeline_en_5.5.1_3.0_1734310323860.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sembr2023_distilbert_base_uncased_pipeline_en_5.5.1_3.0_1734310323860.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("sembr2023_distilbert_base_uncased_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("sembr2023_distilbert_base_uncased_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sembr2023_distilbert_base_uncased_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/admko/sembr2023-distilbert-base-uncased + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-somd_train_xlm_v2_en.md b/docs/_posts/ahmedlone127/2024-12-16-somd_train_xlm_v2_en.md new file mode 100644 index 00000000000000..767ca3c99e9749 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-somd_train_xlm_v2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English somd_train_xlm_v2 XlmRoBertaForTokenClassification from ThuyNT03 +author: John Snow Labs +name: somd_train_xlm_v2 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`somd_train_xlm_v2` is a English model originally trained by ThuyNT03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/somd_train_xlm_v2_en_5.5.1_3.0_1734323982695.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/somd_train_xlm_v2_en_5.5.1_3.0_1734323982695.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = XlmRoBertaForTokenClassification.pretrained("somd_train_xlm_v2","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("somd_train_xlm_v2", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|somd_train_xlm_v2| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|797.7 MB| + +## References + +https://huggingface.co/ThuyNT03/SOMD-train-xlm-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-somd_train_xlm_v2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-somd_train_xlm_v2_pipeline_en.md new file mode 100644 index 00000000000000..72e0c9faafec40 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-somd_train_xlm_v2_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English somd_train_xlm_v2_pipeline pipeline XlmRoBertaForTokenClassification from ThuyNT03 +author: John Snow Labs +name: somd_train_xlm_v2_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`somd_train_xlm_v2_pipeline` is a English model originally trained by ThuyNT03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/somd_train_xlm_v2_pipeline_en_5.5.1_3.0_1734324112931.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/somd_train_xlm_v2_pipeline_en_5.5.1_3.0_1734324112931.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("somd_train_xlm_v2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("somd_train_xlm_v2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|somd_train_xlm_v2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|797.7 MB| + +## References + +https://huggingface.co/ThuyNT03/SOMD-train-xlm-v2 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-somd_xlm_stage1_pre_v1_en.md b/docs/_posts/ahmedlone127/2024-12-16-somd_xlm_stage1_pre_v1_en.md new file mode 100644 index 00000000000000..3302e108a1a705 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-somd_xlm_stage1_pre_v1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English somd_xlm_stage1_pre_v1 XlmRoBertaForTokenClassification from ThuyNT03 +author: John Snow Labs +name: somd_xlm_stage1_pre_v1 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`somd_xlm_stage1_pre_v1` is a English model originally trained by ThuyNT03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/somd_xlm_stage1_pre_v1_en_5.5.1_3.0_1734323353968.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/somd_xlm_stage1_pre_v1_en_5.5.1_3.0_1734323353968.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = XlmRoBertaForTokenClassification.pretrained("somd_xlm_stage1_pre_v1","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("somd_xlm_stage1_pre_v1", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|somd_xlm_stage1_pre_v1| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|790.8 MB| + +## References + +https://huggingface.co/ThuyNT03/SOMD-xlm-stage1-pre-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-somd_xlm_stage1_pre_v1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-somd_xlm_stage1_pre_v1_pipeline_en.md new file mode 100644 index 00000000000000..7de2fa0e14cde3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-somd_xlm_stage1_pre_v1_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English somd_xlm_stage1_pre_v1_pipeline pipeline XlmRoBertaForTokenClassification from ThuyNT03 +author: John Snow Labs +name: somd_xlm_stage1_pre_v1_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`somd_xlm_stage1_pre_v1_pipeline` is a English model originally trained by ThuyNT03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/somd_xlm_stage1_pre_v1_pipeline_en_5.5.1_3.0_1734323486434.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/somd_xlm_stage1_pre_v1_pipeline_en_5.5.1_3.0_1734323486434.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("somd_xlm_stage1_pre_v1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("somd_xlm_stage1_pre_v1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|somd_xlm_stage1_pre_v1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|790.8 MB| + +## References + +https://huggingface.co/ThuyNT03/SOMD-xlm-stage1-pre-v1 + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-speech_latex2_en.md b/docs/_posts/ahmedlone127/2024-12-16-speech_latex2_en.md new file mode 100644 index 00000000000000..f9683ccdf712f4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-speech_latex2_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English speech_latex2 T5Transformer from vinalal +author: John Snow Labs +name: speech_latex2 +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`speech_latex2` is a English model originally trained by vinalal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/speech_latex2_en_5.5.1_3.0_1734332232469.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/speech_latex2_en_5.5.1_3.0_1734332232469.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("speech_latex2","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("speech_latex2", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|speech_latex2| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|317.5 MB| + +## References + +https://huggingface.co/vinalal/speech-latex2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-speech_latex2_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-speech_latex2_pipeline_en.md new file mode 100644 index 00000000000000..2541acd321c99b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-speech_latex2_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English speech_latex2_pipeline pipeline T5Transformer from vinalal +author: John Snow Labs +name: speech_latex2_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`speech_latex2_pipeline` is a English model originally trained by vinalal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/speech_latex2_pipeline_en_5.5.1_3.0_1734332269286.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/speech_latex2_pipeline_en_5.5.1_3.0_1734332269286.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("speech_latex2_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("speech_latex2_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|speech_latex2_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|317.5 MB| + +## References + +https://huggingface.co/vinalal/speech-latex2 + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-srl_bert_en.md b/docs/_posts/ahmedlone127/2024-12-16-srl_bert_en.md new file mode 100644 index 00000000000000..c417b2ce902aaa --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-srl_bert_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English srl_bert DistilBertForTokenClassification from martincc98 +author: John Snow Labs +name: srl_bert +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, distilbert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`srl_bert` is a English model originally trained by martincc98. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/srl_bert_en_5.5.1_3.0_1734311032099.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/srl_bert_en_5.5.1_3.0_1734311032099.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = DistilBertForTokenClassification.pretrained("srl_bert","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = DistilBertForTokenClassification.pretrained("srl_bert", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|srl_bert| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/martincc98/srl_bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-srl_bert_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-srl_bert_pipeline_en.md new file mode 100644 index 00000000000000..b33fe046b1e1ad --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-srl_bert_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English srl_bert_pipeline pipeline DistilBertForTokenClassification from martincc98 +author: John Snow Labs +name: srl_bert_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`srl_bert_pipeline` is a English model originally trained by martincc98. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/srl_bert_pipeline_en_5.5.1_3.0_1734311047133.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/srl_bert_pipeline_en_5.5.1_3.0_1734311047133.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("srl_bert_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("srl_bert_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|srl_bert_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|247.4 MB| + +## References + +https://huggingface.co/martincc98/srl_bert + +## Included Models + +- DocumentAssembler +- TokenizerModel +- DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-summarization_model_dross20_en.md b/docs/_posts/ahmedlone127/2024-12-16-summarization_model_dross20_en.md new file mode 100644 index 00000000000000..32b2079f16ead2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-summarization_model_dross20_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English summarization_model_dross20 T5Transformer from dross20 +author: John Snow Labs +name: summarization_model_dross20 +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`summarization_model_dross20` is a English model originally trained by dross20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/summarization_model_dross20_en_5.5.1_3.0_1734328107428.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/summarization_model_dross20_en_5.5.1_3.0_1734328107428.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("summarization_model_dross20","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("summarization_model_dross20", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|summarization_model_dross20| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|337.9 MB| + +## References + +https://huggingface.co/dross20/summarization_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-summarization_model_dross20_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-summarization_model_dross20_pipeline_en.md new file mode 100644 index 00000000000000..a6fbc08468216b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-summarization_model_dross20_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English summarization_model_dross20_pipeline pipeline T5Transformer from dross20 +author: John Snow Labs +name: summarization_model_dross20_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`summarization_model_dross20_pipeline` is a English model originally trained by dross20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/summarization_model_dross20_pipeline_en_5.5.1_3.0_1734328127703.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/summarization_model_dross20_pipeline_en_5.5.1_3.0_1734328127703.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("summarization_model_dross20_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("summarization_model_dross20_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|summarization_model_dross20_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|337.9 MB| + +## References + +https://huggingface.co/dross20/summarization_model + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_base_patch4_window12_384_microsoft_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_base_patch4_window12_384_microsoft_en.md new file mode 100644 index 00000000000000..24946eb87a2677 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_base_patch4_window12_384_microsoft_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English swin_base_patch4_window12_384_microsoft SwinForImageClassification from microsoft +author: John Snow Labs +name: swin_base_patch4_window12_384_microsoft +date: 2024-12-16 +tags: [en, open_source, onnx, image_classification, swin] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: SwinForImageClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_base_patch4_window12_384_microsoft` is a English model originally trained by microsoft. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_base_patch4_window12_384_microsoft_en_5.5.1_3.0_1734325603267.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_base_patch4_window12_384_microsoft_en_5.5.1_3.0_1734325603267.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

image_assembler = ImageAssembler() \
    .setInputCol("image") \
    .setOutputCol("image_assembler")

imageClassifier = SwinForImageClassification.pretrained("swin_base_patch4_window12_384_microsoft","en") \
    .setInputCols("image_assembler") \
    .setOutputCol("class")

pipeline = Pipeline(stages=[
    image_assembler,
    imageClassifier,
])

pipelineModel = pipeline.fit(imageDF)

pipelineDF = pipelineModel.transform(imageDF)

```
```scala

val imageAssembler = new ImageAssembler()
    .setInputCol("image")
    .setOutputCol("image_assembler")

val imageClassifier = SwinForImageClassification.pretrained("swin_base_patch4_window12_384_microsoft","en")
    .setInputCols("image_assembler")
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(imageAssembler, imageClassifier))

val pipelineModel = pipeline.fit(imageDF)

val pipelineDF = pipelineModel.transform(imageDF)

```
</div>
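
Both snippets reference an `imageDF` that is not defined above; a minimal sketch of loading one with Spark's image data source, where the folder path is an assumption:

```python
# Read images into a DataFrame with an "image" column, skipping unreadable files.
imageDF = spark.read \
    .format("image") \
    .option("dropInvalid", True) \
    .load("path/to/images")
```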
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_base_patch4_window12_384_microsoft| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[image_assembler]| +|Output Labels:|[label]| +|Language:|en| +|Size:|659.7 MB| + +## References + +https://huggingface.co/microsoft/swin-base-patch4-window12-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_base_patch4_window12_384_microsoft_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_base_patch4_window12_384_microsoft_pipeline_en.md new file mode 100644 index 00000000000000..012d438b863a2b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_base_patch4_window12_384_microsoft_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English swin_base_patch4_window12_384_microsoft_pipeline pipeline SwinForImageClassification from microsoft +author: John Snow Labs +name: swin_base_patch4_window12_384_microsoft_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_base_patch4_window12_384_microsoft_pipeline` is a English model originally trained by microsoft. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_base_patch4_window12_384_microsoft_pipeline_en_5.5.1_3.0_1734325644479.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_base_patch4_window12_384_microsoft_pipeline_en_5.5.1_3.0_1734325644479.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("swin_base_patch4_window12_384_microsoft_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("swin_base_patch4_window12_384_microsoft_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
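
Because this pipeline begins with an ImageAssembler, the `df` passed to `transform` should be an image DataFrame rather than a text one. A minimal sketch, where the folder path and the `class` output column (taken from the standalone model card above) are assumptions:

```python
# Load images and run the pretrained image-classification pipeline over them.
df = spark.read.format("image").option("dropInvalid", True).load("path/to/images")
annotations = pipeline.transform(df)
annotations.select("class.result").show(truncate=False)
```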
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_base_patch4_window12_384_microsoft_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|659.7 MB| + +## References + +https://huggingface.co/microsoft/swin-base-patch4-window12-384 + +## Included Models + +- ImageAssembler +- SwinForImageClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_base_patch4_window7_224_microsoft_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_base_patch4_window7_224_microsoft_en.md new file mode 100644 index 00000000000000..cfe1a4d7d06c24 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_base_patch4_window7_224_microsoft_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English swin_base_patch4_window7_224_microsoft SwinForImageClassification from microsoft +author: John Snow Labs +name: swin_base_patch4_window7_224_microsoft +date: 2024-12-16 +tags: [en, open_source, onnx, image_classification, swin] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: SwinForImageClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_base_patch4_window7_224_microsoft` is a English model originally trained by microsoft. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_base_patch4_window7_224_microsoft_en_5.5.1_3.0_1734325556138.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_base_patch4_window7_224_microsoft_en_5.5.1_3.0_1734325556138.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

image_assembler = ImageAssembler() \
    .setInputCol("image") \
    .setOutputCol("image_assembler")

imageClassifier = SwinForImageClassification.pretrained("swin_base_patch4_window7_224_microsoft","en") \
    .setInputCols("image_assembler") \
    .setOutputCol("class")

pipeline = Pipeline(stages=[
    image_assembler,
    imageClassifier,
])

pipelineModel = pipeline.fit(imageDF)

pipelineDF = pipelineModel.transform(imageDF)

```
```scala

val imageAssembler = new ImageAssembler()
    .setInputCol("image")
    .setOutputCol("image_assembler")

val imageClassifier = SwinForImageClassification.pretrained("swin_base_patch4_window7_224_microsoft","en")
    .setInputCols("image_assembler")
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(imageAssembler, imageClassifier))

val pipelineModel = pipeline.fit(imageDF)

val pipelineDF = pipelineModel.transform(imageDF)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_base_patch4_window7_224_microsoft| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[image_assembler]| +|Output Labels:|[label]| +|Language:|en| +|Size:|657.4 MB| + +## References + +https://huggingface.co/microsoft/swin-base-patch4-window7-224 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_base_patch4_window7_224_microsoft_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_base_patch4_window7_224_microsoft_pipeline_en.md new file mode 100644 index 00000000000000..bf8a9c790a29bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_base_patch4_window7_224_microsoft_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English swin_base_patch4_window7_224_microsoft_pipeline pipeline SwinForImageClassification from microsoft +author: John Snow Labs +name: swin_base_patch4_window7_224_microsoft_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_base_patch4_window7_224_microsoft_pipeline` is a English model originally trained by microsoft. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_base_patch4_window7_224_microsoft_pipeline_en_5.5.1_3.0_1734325591325.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_base_patch4_window7_224_microsoft_pipeline_en_5.5.1_3.0_1734325591325.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("swin_base_patch4_window7_224_microsoft_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("swin_base_patch4_window7_224_microsoft_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_base_patch4_window7_224_microsoft_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|657.4 MB| + +## References + +https://huggingface.co/microsoft/swin-base-patch4-window7-224 + +## Included Models + +- ImageAssembler +- SwinForImageClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_base_patch4_window7_224_rice_disease_02_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_base_patch4_window7_224_rice_disease_02_en.md new file mode 100644 index 00000000000000..68f80c280af6bf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_base_patch4_window7_224_rice_disease_02_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English swin_base_patch4_window7_224_rice_disease_02 SwinForImageClassification from cvmil +author: John Snow Labs +name: swin_base_patch4_window7_224_rice_disease_02 +date: 2024-12-16 +tags: [en, open_source, onnx, image_classification, swin] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: SwinForImageClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_base_patch4_window7_224_rice_disease_02` is a English model originally trained by cvmil. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_base_patch4_window7_224_rice_disease_02_en_5.5.1_3.0_1734325322083.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_base_patch4_window7_224_rice_disease_02_en_5.5.1_3.0_1734325322083.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

image_assembler = ImageAssembler() \
    .setInputCol("image") \
    .setOutputCol("image_assembler")

imageClassifier = SwinForImageClassification.pretrained("swin_base_patch4_window7_224_rice_disease_02","en") \
    .setInputCols("image_assembler") \
    .setOutputCol("class")

pipeline = Pipeline(stages=[
    image_assembler,
    imageClassifier,
])

pipelineModel = pipeline.fit(imageDF)

pipelineDF = pipelineModel.transform(imageDF)

```
```scala

val imageAssembler = new ImageAssembler()
    .setInputCol("image")
    .setOutputCol("image_assembler")

val imageClassifier = SwinForImageClassification.pretrained("swin_base_patch4_window7_224_rice_disease_02","en")
    .setInputCols("image_assembler")
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(imageAssembler, imageClassifier))

val pipelineModel = pipeline.fit(imageDF)

val pipelineDF = pipelineModel.transform(imageDF)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_base_patch4_window7_224_rice_disease_02| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[image_assembler]| +|Output Labels:|[label]| +|Language:|en| +|Size:|649.8 MB| + +## References + +https://huggingface.co/cvmil/swin-base-patch4-window7-224_rice-disease-02 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_base_patch4_window7_224_rice_disease_02_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_base_patch4_window7_224_rice_disease_02_pipeline_en.md new file mode 100644 index 00000000000000..11a302574bbee5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_base_patch4_window7_224_rice_disease_02_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English swin_base_patch4_window7_224_rice_disease_02_pipeline pipeline SwinForImageClassification from cvmil +author: John Snow Labs +name: swin_base_patch4_window7_224_rice_disease_02_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_base_patch4_window7_224_rice_disease_02_pipeline` is a English model originally trained by cvmil. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_base_patch4_window7_224_rice_disease_02_pipeline_en_5.5.1_3.0_1734325355573.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_base_patch4_window7_224_rice_disease_02_pipeline_en_5.5.1_3.0_1734325355573.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("swin_base_patch4_window7_224_rice_disease_02_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("swin_base_patch4_window7_224_rice_disease_02_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_base_patch4_window7_224_rice_disease_02_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|649.8 MB| + +## References + +https://huggingface.co/cvmil/swin-base-patch4-window7-224_rice-disease-02 + +## Included Models + +- ImageAssembler +- SwinForImageClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_finetuned_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_finetuned_en.md new file mode 100644 index 00000000000000..48eaf83ccc6001 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_finetuned_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English swin_finetuned SwinForImageClassification from kijeong22 +author: John Snow Labs +name: swin_finetuned +date: 2024-12-16 +tags: [en, open_source, onnx, image_classification, swin] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: SwinForImageClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_finetuned` is a English model originally trained by kijeong22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_finetuned_en_5.5.1_3.0_1734325399925.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_finetuned_en_5.5.1_3.0_1734325399925.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

image_assembler = ImageAssembler() \
    .setInputCol("image") \
    .setOutputCol("image_assembler")

imageClassifier = SwinForImageClassification.pretrained("swin_finetuned","en") \
    .setInputCols("image_assembler") \
    .setOutputCol("class")

pipeline = Pipeline(stages=[
    image_assembler,
    imageClassifier,
])

pipelineModel = pipeline.fit(imageDF)

pipelineDF = pipelineModel.transform(imageDF)

```
```scala

val imageAssembler = new ImageAssembler()
    .setInputCol("image")
    .setOutputCol("image_assembler")

val imageClassifier = SwinForImageClassification.pretrained("swin_finetuned","en")
    .setInputCols("image_assembler")
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(imageAssembler, imageClassifier))

val pipelineModel = pipeline.fit(imageDF)

val pipelineDF = pipelineModel.transform(imageDF)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_finetuned| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[image_assembler]| +|Output Labels:|[label]| +|Language:|en| +|Size:|649.8 MB| + +## References + +https://huggingface.co/kijeong22/swin-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_finetuned_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_finetuned_pipeline_en.md new file mode 100644 index 00000000000000..90efdaa2514b3e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_finetuned_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English swin_finetuned_pipeline pipeline SwinForImageClassification from kijeong22 +author: John Snow Labs +name: swin_finetuned_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_finetuned_pipeline` is a English model originally trained by kijeong22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_finetuned_pipeline_en_5.5.1_3.0_1734325432106.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_finetuned_pipeline_en_5.5.1_3.0_1734325432106.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("swin_finetuned_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("swin_finetuned_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_finetuned_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|649.8 MB| + +## References + +https://huggingface.co/kijeong22/swin-finetuned + +## Included Models + +- ImageAssembler +- SwinForImageClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_small_finetuned_cifar100_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_small_finetuned_cifar100_en.md new file mode 100644 index 00000000000000..6f2ab361567b10 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_small_finetuned_cifar100_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English swin_small_finetuned_cifar100 SwinForImageClassification from MazenAmria +author: John Snow Labs +name: swin_small_finetuned_cifar100 +date: 2024-12-16 +tags: [en, open_source, onnx, image_classification, swin] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: SwinForImageClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_small_finetuned_cifar100` is a English model originally trained by MazenAmria. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_small_finetuned_cifar100_en_5.5.1_3.0_1734325755656.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_small_finetuned_cifar100_en_5.5.1_3.0_1734325755656.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

image_assembler = ImageAssembler() \
    .setInputCol("image") \
    .setOutputCol("image_assembler")

imageClassifier = SwinForImageClassification.pretrained("swin_small_finetuned_cifar100","en") \
    .setInputCols("image_assembler") \
    .setOutputCol("class")

pipeline = Pipeline(stages=[
    image_assembler,
    imageClassifier,
])

pipelineModel = pipeline.fit(imageDF)

pipelineDF = pipelineModel.transform(imageDF)

```
```scala

val imageAssembler = new ImageAssembler()
    .setInputCol("image")
    .setOutputCol("image_assembler")

val imageClassifier = SwinForImageClassification.pretrained("swin_small_finetuned_cifar100","en")
    .setInputCols("image_assembler")
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(imageAssembler, imageClassifier))

val pipelineModel = pipeline.fit(imageDF)

val pipelineDF = pipelineModel.transform(imageDF)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_small_finetuned_cifar100| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[image_assembler]| +|Output Labels:|[label]| +|Language:|en| +|Size:|366.8 MB| + +## References + +https://huggingface.co/MazenAmria/swin-small-finetuned-cifar100 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_small_finetuned_cifar100_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_small_finetuned_cifar100_pipeline_en.md new file mode 100644 index 00000000000000..dd1e47790bdca9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_small_finetuned_cifar100_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English swin_small_finetuned_cifar100_pipeline pipeline SwinForImageClassification from MazenAmria +author: John Snow Labs +name: swin_small_finetuned_cifar100_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_small_finetuned_cifar100_pipeline` is a English model originally trained by MazenAmria. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_small_finetuned_cifar100_pipeline_en_5.5.1_3.0_1734325774238.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_small_finetuned_cifar100_pipeline_en_5.5.1_3.0_1734325774238.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("swin_small_finetuned_cifar100_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("swin_small_finetuned_cifar100_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
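+
+`df` above is assumed to already exist. For this image-classification pipeline it would be a DataFrame of images; a rough sketch under that assumption (the import and folder path are illustrative):
+
+```python
+from sparknlp.pretrained import PretrainedPipeline
+
+# Placeholder path: any folder of images readable by Spark's image source.
+df = spark.read.format("image").option("dropInvalid", True).load("path/to/images")
+
+pipeline = PretrainedPipeline("swin_small_finetuned_cifar100_pipeline", lang = "en")
+annotations = pipeline.transform(df)
+
+# The classifier's predictions are expected in the "class" annotation column.
+annotations.selectExpr("image.origin", "class.result").show(truncate=False)
+```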
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_small_finetuned_cifar100_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|366.8 MB| + +## References + +https://huggingface.co/MazenAmria/swin-small-finetuned-cifar100 + +## Included Models + +- ImageAssembler +- SwinForImageClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_small_patch4_window7_224_finetuned_isic217_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_small_patch4_window7_224_finetuned_isic217_en.md new file mode 100644 index 00000000000000..8f4880db11e356 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_small_patch4_window7_224_finetuned_isic217_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English swin_small_patch4_window7_224_finetuned_isic217 SwinForImageClassification from vananhle +author: John Snow Labs +name: swin_small_patch4_window7_224_finetuned_isic217 +date: 2024-12-16 +tags: [en, open_source, onnx, image_classification, swin] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: SwinForImageClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_small_patch4_window7_224_finetuned_isic217` is a English model originally trained by vananhle. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_small_patch4_window7_224_finetuned_isic217_en_5.5.1_3.0_1734325544695.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_small_patch4_window7_224_finetuned_isic217_en_5.5.1_3.0_1734325544695.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+# imageDF: a Spark DataFrame with an "image" column,
+# e.g. loaded via spark.read.format("image").load("path/to/images")
+image_assembler = ImageAssembler()\
+    .setInputCol("image")\
+    .setOutputCol("image_assembler")
+
+imageClassifier = SwinForImageClassification.pretrained("swin_small_patch4_window7_224_finetuned_isic217","en")\
+    .setInputCols("image_assembler")\
+    .setOutputCol("class")
+
+pipeline = Pipeline(stages=[
+    image_assembler,
+    imageClassifier,
+])
+
+pipelineModel = pipeline.fit(imageDF)
+
+pipelineDF = pipelineModel.transform(imageDF)
+
+```
+```scala
+
+val imageAssembler = new ImageAssembler()
+  .setInputCol("image")
+  .setOutputCol("image_assembler")
+
+val imageClassifier = SwinForImageClassification.pretrained("swin_small_patch4_window7_224_finetuned_isic217","en")
+  .setInputCols("image_assembler")
+  .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(imageAssembler, imageClassifier))
+
+val pipelineModel = pipeline.fit(imageDF)
+
+val pipelineDF = pipelineModel.transform(imageDF)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_small_patch4_window7_224_finetuned_isic217| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[image_assembler]| +|Output Labels:|[label]| +|Language:|en| +|Size:|366.2 MB| + +## References + +https://huggingface.co/vananhle/swin-small-patch4-window7-224-finetuned-isic217 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_small_patch4_window7_224_finetuned_isic217_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_small_patch4_window7_224_finetuned_isic217_pipeline_en.md new file mode 100644 index 00000000000000..748a01a8c6a635 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_small_patch4_window7_224_finetuned_isic217_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English swin_small_patch4_window7_224_finetuned_isic217_pipeline pipeline SwinForImageClassification from vananhle +author: John Snow Labs +name: swin_small_patch4_window7_224_finetuned_isic217_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_small_patch4_window7_224_finetuned_isic217_pipeline` is a English model originally trained by vananhle. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_small_patch4_window7_224_finetuned_isic217_pipeline_en_5.5.1_3.0_1734325563614.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_small_patch4_window7_224_finetuned_isic217_pipeline_en_5.5.1_3.0_1734325563614.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("swin_small_patch4_window7_224_finetuned_isic217_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("swin_small_patch4_window7_224_finetuned_isic217_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_small_patch4_window7_224_finetuned_isic217_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|366.3 MB| + +## References + +https://huggingface.co/vananhle/swin-small-patch4-window7-224-finetuned-isic217 + +## Included Models + +- ImageAssembler +- SwinForImageClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_cats_dogs_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_cats_dogs_en.md new file mode 100644 index 00000000000000..efa59c494925e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_cats_dogs_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English swin_tiny_patch4_window7_224_cats_dogs SwinForImageClassification from tommilyjones +author: John Snow Labs +name: swin_tiny_patch4_window7_224_cats_dogs +date: 2024-12-16 +tags: [en, open_source, onnx, image_classification, swin] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: SwinForImageClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_tiny_patch4_window7_224_cats_dogs` is a English model originally trained by tommilyjones. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_cats_dogs_en_5.5.1_3.0_1734325650742.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_cats_dogs_en_5.5.1_3.0_1734325650742.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+# imageDF: a Spark DataFrame with an "image" column,
+# e.g. loaded via spark.read.format("image").load("path/to/images")
+image_assembler = ImageAssembler()\
+    .setInputCol("image")\
+    .setOutputCol("image_assembler")
+
+imageClassifier = SwinForImageClassification.pretrained("swin_tiny_patch4_window7_224_cats_dogs","en")\
+    .setInputCols("image_assembler")\
+    .setOutputCol("class")
+
+pipeline = Pipeline(stages=[
+    image_assembler,
+    imageClassifier,
+])
+
+pipelineModel = pipeline.fit(imageDF)
+
+pipelineDF = pipelineModel.transform(imageDF)
+
+```
+```scala
+
+val imageAssembler = new ImageAssembler()
+  .setInputCol("image")
+  .setOutputCol("image_assembler")
+
+val imageClassifier = SwinForImageClassification.pretrained("swin_tiny_patch4_window7_224_cats_dogs","en")
+  .setInputCols("image_assembler")
+  .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(imageAssembler, imageClassifier))
+
+val pipelineModel = pipeline.fit(imageDF)
+
+val pipelineDF = pipelineModel.transform(imageDF)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_tiny_patch4_window7_224_cats_dogs| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[image_assembler]| +|Output Labels:|[label]| +|Language:|en| +|Size:|206.3 MB| + +## References + +https://huggingface.co/tommilyjones/swin-tiny-patch4-window7-224-cats_dogs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_cats_dogs_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_cats_dogs_pipeline_en.md new file mode 100644 index 00000000000000..c8bb500f434959 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_cats_dogs_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English swin_tiny_patch4_window7_224_cats_dogs_pipeline pipeline SwinForImageClassification from tommilyjones +author: John Snow Labs +name: swin_tiny_patch4_window7_224_cats_dogs_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_tiny_patch4_window7_224_cats_dogs_pipeline` is a English model originally trained by tommilyjones. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_cats_dogs_pipeline_en_5.5.1_3.0_1734325672097.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_cats_dogs_pipeline_en_5.5.1_3.0_1734325672097.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("swin_tiny_patch4_window7_224_cats_dogs_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("swin_tiny_patch4_window7_224_cats_dogs_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_tiny_patch4_window7_224_cats_dogs_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|206.3 MB| + +## References + +https://huggingface.co/tommilyjones/swin-tiny-patch4-window7-224-cats_dogs + +## Included Models + +- ImageAssembler +- SwinForImageClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_cifar10_nielsr_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_cifar10_nielsr_en.md new file mode 100644 index 00000000000000..9b8e00de3c031c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_cifar10_nielsr_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English swin_tiny_patch4_window7_224_finetuned_cifar10_nielsr SwinForImageClassification from nielsr +author: John Snow Labs +name: swin_tiny_patch4_window7_224_finetuned_cifar10_nielsr +date: 2024-12-16 +tags: [en, open_source, onnx, image_classification, swin] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: SwinForImageClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_tiny_patch4_window7_224_finetuned_cifar10_nielsr` is a English model originally trained by nielsr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_finetuned_cifar10_nielsr_en_5.5.1_3.0_1734325271458.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_finetuned_cifar10_nielsr_en_5.5.1_3.0_1734325271458.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+# imageDF: a Spark DataFrame with an "image" column,
+# e.g. loaded via spark.read.format("image").load("path/to/images")
+image_assembler = ImageAssembler()\
+    .setInputCol("image")\
+    .setOutputCol("image_assembler")
+
+imageClassifier = SwinForImageClassification.pretrained("swin_tiny_patch4_window7_224_finetuned_cifar10_nielsr","en")\
+    .setInputCols("image_assembler")\
+    .setOutputCol("class")
+
+pipeline = Pipeline(stages=[
+    image_assembler,
+    imageClassifier,
+])
+
+pipelineModel = pipeline.fit(imageDF)
+
+pipelineDF = pipelineModel.transform(imageDF)
+
+```
+```scala
+
+val imageAssembler = new ImageAssembler()
+  .setInputCol("image")
+  .setOutputCol("image_assembler")
+
+val imageClassifier = SwinForImageClassification.pretrained("swin_tiny_patch4_window7_224_finetuned_cifar10_nielsr","en")
+  .setInputCols("image_assembler")
+  .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(imageAssembler, imageClassifier))
+
+val pipelineModel = pipeline.fit(imageDF)
+
+val pipelineDF = pipelineModel.transform(imageDF)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_tiny_patch4_window7_224_finetuned_cifar10_nielsr| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[image_assembler]| +|Output Labels:|[label]| +|Language:|en| +|Size:|206.4 MB| + +## References + +https://huggingface.co/nielsr/swin-tiny-patch4-window7-224-finetuned-cifar10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_cifar10_nielsr_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_cifar10_nielsr_pipeline_en.md new file mode 100644 index 00000000000000..09691d28151f90 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_cifar10_nielsr_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English swin_tiny_patch4_window7_224_finetuned_cifar10_nielsr_pipeline pipeline SwinForImageClassification from nielsr +author: John Snow Labs +name: swin_tiny_patch4_window7_224_finetuned_cifar10_nielsr_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_tiny_patch4_window7_224_finetuned_cifar10_nielsr_pipeline` is a English model originally trained by nielsr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_finetuned_cifar10_nielsr_pipeline_en_5.5.1_3.0_1734325281950.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_finetuned_cifar10_nielsr_pipeline_en_5.5.1_3.0_1734325281950.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("swin_tiny_patch4_window7_224_finetuned_cifar10_nielsr_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("swin_tiny_patch4_window7_224_finetuned_cifar10_nielsr_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_tiny_patch4_window7_224_finetuned_cifar10_nielsr_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|206.4 MB| + +## References + +https://huggingface.co/nielsr/swin-tiny-patch4-window7-224-finetuned-cifar10 + +## Included Models + +- ImageAssembler +- SwinForImageClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_eurosat_andrecastro_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_eurosat_andrecastro_en.md new file mode 100644 index 00000000000000..4c6dea4f39ea91 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_eurosat_andrecastro_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English swin_tiny_patch4_window7_224_finetuned_eurosat_andrecastro SwinForImageClassification from andrecastro +author: John Snow Labs +name: swin_tiny_patch4_window7_224_finetuned_eurosat_andrecastro +date: 2024-12-16 +tags: [en, open_source, onnx, image_classification, swin] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: SwinForImageClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_tiny_patch4_window7_224_finetuned_eurosat_andrecastro` is a English model originally trained by andrecastro. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_finetuned_eurosat_andrecastro_en_5.5.1_3.0_1734324879268.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_finetuned_eurosat_andrecastro_en_5.5.1_3.0_1734324879268.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+# imageDF: a Spark DataFrame with an "image" column,
+# e.g. loaded via spark.read.format("image").load("path/to/images")
+image_assembler = ImageAssembler()\
+    .setInputCol("image")\
+    .setOutputCol("image_assembler")
+
+imageClassifier = SwinForImageClassification.pretrained("swin_tiny_patch4_window7_224_finetuned_eurosat_andrecastro","en")\
+    .setInputCols("image_assembler")\
+    .setOutputCol("class")
+
+pipeline = Pipeline(stages=[
+    image_assembler,
+    imageClassifier,
+])
+
+pipelineModel = pipeline.fit(imageDF)
+
+pipelineDF = pipelineModel.transform(imageDF)
+
+```
+```scala
+
+val imageAssembler = new ImageAssembler()
+  .setInputCol("image")
+  .setOutputCol("image_assembler")
+
+val imageClassifier = SwinForImageClassification.pretrained("swin_tiny_patch4_window7_224_finetuned_eurosat_andrecastro","en")
+  .setInputCols("image_assembler")
+  .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(imageAssembler, imageClassifier))
+
+val pipelineModel = pipeline.fit(imageDF)
+
+val pipelineDF = pipelineModel.transform(imageDF)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_tiny_patch4_window7_224_finetuned_eurosat_andrecastro| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[image_assembler]| +|Output Labels:|[label]| +|Language:|en| +|Size:|206.3 MB| + +## References + +https://huggingface.co/andrecastro/swin-tiny-patch4-window7-224-finetuned-eurosat \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_eurosat_andrecastro_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_eurosat_andrecastro_pipeline_en.md new file mode 100644 index 00000000000000..d5c88bfd86bcc6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_eurosat_andrecastro_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English swin_tiny_patch4_window7_224_finetuned_eurosat_andrecastro_pipeline pipeline SwinForImageClassification from andrecastro +author: John Snow Labs +name: swin_tiny_patch4_window7_224_finetuned_eurosat_andrecastro_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_tiny_patch4_window7_224_finetuned_eurosat_andrecastro_pipeline` is a English model originally trained by andrecastro. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_finetuned_eurosat_andrecastro_pipeline_en_5.5.1_3.0_1734324890161.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_finetuned_eurosat_andrecastro_pipeline_en_5.5.1_3.0_1734324890161.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("swin_tiny_patch4_window7_224_finetuned_eurosat_andrecastro_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("swin_tiny_patch4_window7_224_finetuned_eurosat_andrecastro_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_tiny_patch4_window7_224_finetuned_eurosat_andrecastro_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|206.3 MB| + +## References + +https://huggingface.co/andrecastro/swin-tiny-patch4-window7-224-finetuned-eurosat + +## Included Models + +- ImageAssembler +- SwinForImageClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_eurosat_polejowska_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_eurosat_polejowska_en.md new file mode 100644 index 00000000000000..23c5508d0c22a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_eurosat_polejowska_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English swin_tiny_patch4_window7_224_finetuned_eurosat_polejowska SwinForImageClassification from polejowska +author: John Snow Labs +name: swin_tiny_patch4_window7_224_finetuned_eurosat_polejowska +date: 2024-12-16 +tags: [en, open_source, onnx, image_classification, swin] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: SwinForImageClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_tiny_patch4_window7_224_finetuned_eurosat_polejowska` is a English model originally trained by polejowska. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_finetuned_eurosat_polejowska_en_5.5.1_3.0_1734325100481.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_finetuned_eurosat_polejowska_en_5.5.1_3.0_1734325100481.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+# imageDF: a Spark DataFrame with an "image" column,
+# e.g. loaded via spark.read.format("image").load("path/to/images")
+image_assembler = ImageAssembler()\
+    .setInputCol("image")\
+    .setOutputCol("image_assembler")
+
+imageClassifier = SwinForImageClassification.pretrained("swin_tiny_patch4_window7_224_finetuned_eurosat_polejowska","en")\
+    .setInputCols("image_assembler")\
+    .setOutputCol("class")
+
+pipeline = Pipeline(stages=[
+    image_assembler,
+    imageClassifier,
+])
+
+pipelineModel = pipeline.fit(imageDF)
+
+pipelineDF = pipelineModel.transform(imageDF)
+
+```
+```scala
+
+val imageAssembler = new ImageAssembler()
+  .setInputCol("image")
+  .setOutputCol("image_assembler")
+
+val imageClassifier = SwinForImageClassification.pretrained("swin_tiny_patch4_window7_224_finetuned_eurosat_polejowska","en")
+  .setInputCols("image_assembler")
+  .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(imageAssembler, imageClassifier))
+
+val pipelineModel = pipeline.fit(imageDF)
+
+val pipelineDF = pipelineModel.transform(imageDF)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_tiny_patch4_window7_224_finetuned_eurosat_polejowska| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[image_assembler]| +|Output Labels:|[label]| +|Language:|en| +|Size:|206.3 MB| + +## References + +https://huggingface.co/polejowska/swin-tiny-patch4-window7-224-finetuned-eurosat \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_eurosat_polejowska_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_eurosat_polejowska_pipeline_en.md new file mode 100644 index 00000000000000..362bbd970e0fcd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_eurosat_polejowska_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English swin_tiny_patch4_window7_224_finetuned_eurosat_polejowska_pipeline pipeline SwinForImageClassification from polejowska +author: John Snow Labs +name: swin_tiny_patch4_window7_224_finetuned_eurosat_polejowska_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_tiny_patch4_window7_224_finetuned_eurosat_polejowska_pipeline` is a English model originally trained by polejowska. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_finetuned_eurosat_polejowska_pipeline_en_5.5.1_3.0_1734325111019.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_finetuned_eurosat_polejowska_pipeline_en_5.5.1_3.0_1734325111019.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("swin_tiny_patch4_window7_224_finetuned_eurosat_polejowska_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("swin_tiny_patch4_window7_224_finetuned_eurosat_polejowska_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_tiny_patch4_window7_224_finetuned_eurosat_polejowska_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|206.4 MB| + +## References + +https://huggingface.co/polejowska/swin-tiny-patch4-window7-224-finetuned-eurosat + +## Included Models + +- ImageAssembler +- SwinForImageClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_rcc_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_rcc_en.md new file mode 100644 index 00000000000000..fc8c3f8963d42a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_rcc_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English swin_tiny_patch4_window7_224_finetuned_rcc SwinForImageClassification from synergyai-jaeung +author: John Snow Labs +name: swin_tiny_patch4_window7_224_finetuned_rcc +date: 2024-12-16 +tags: [en, open_source, onnx, image_classification, swin] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: SwinForImageClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_tiny_patch4_window7_224_finetuned_rcc` is a English model originally trained by synergyai-jaeung. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_finetuned_rcc_en_5.5.1_3.0_1734325035960.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_finetuned_rcc_en_5.5.1_3.0_1734325035960.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+# imageDF: a Spark DataFrame with an "image" column,
+# e.g. loaded via spark.read.format("image").load("path/to/images")
+image_assembler = ImageAssembler()\
+    .setInputCol("image")\
+    .setOutputCol("image_assembler")
+
+imageClassifier = SwinForImageClassification.pretrained("swin_tiny_patch4_window7_224_finetuned_rcc","en")\
+    .setInputCols("image_assembler")\
+    .setOutputCol("class")
+
+pipeline = Pipeline(stages=[
+    image_assembler,
+    imageClassifier,
+])
+
+pipelineModel = pipeline.fit(imageDF)
+
+pipelineDF = pipelineModel.transform(imageDF)
+
+```
+```scala
+
+val imageAssembler = new ImageAssembler()
+  .setInputCol("image")
+  .setOutputCol("image_assembler")
+
+val imageClassifier = SwinForImageClassification.pretrained("swin_tiny_patch4_window7_224_finetuned_rcc","en")
+  .setInputCols("image_assembler")
+  .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(imageAssembler, imageClassifier))
+
+val pipelineModel = pipeline.fit(imageDF)
+
+val pipelineDF = pipelineModel.transform(imageDF)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_tiny_patch4_window7_224_finetuned_rcc| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[image_assembler]| +|Output Labels:|[label]| +|Language:|en| +|Size:|206.3 MB| + +## References + +https://huggingface.co/synergyai-jaeung/swin-tiny-patch4-window7-224-finetuned-RCC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_rcc_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_rcc_pipeline_en.md new file mode 100644 index 00000000000000..9f76eaa082868d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_finetuned_rcc_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English swin_tiny_patch4_window7_224_finetuned_rcc_pipeline pipeline SwinForImageClassification from synergyai-jaeung +author: John Snow Labs +name: swin_tiny_patch4_window7_224_finetuned_rcc_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_tiny_patch4_window7_224_finetuned_rcc_pipeline` is a English model originally trained by synergyai-jaeung. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_finetuned_rcc_pipeline_en_5.5.1_3.0_1734325046388.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_finetuned_rcc_pipeline_en_5.5.1_3.0_1734325046388.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("swin_tiny_patch4_window7_224_finetuned_rcc_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("swin_tiny_patch4_window7_224_finetuned_rcc_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_tiny_patch4_window7_224_finetuned_rcc_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|206.3 MB| + +## References + +https://huggingface.co/synergyai-jaeung/swin-tiny-patch4-window7-224-finetuned-RCC + +## Included Models + +- ImageAssembler +- SwinForImageClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_mm_classification_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_mm_classification_en.md new file mode 100644 index 00000000000000..ba0458ea10e3a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_mm_classification_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English swin_tiny_patch4_window7_224_mm_classification SwinForImageClassification from djbp +author: John Snow Labs +name: swin_tiny_patch4_window7_224_mm_classification +date: 2024-12-16 +tags: [en, open_source, onnx, image_classification, swin] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: SwinForImageClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_tiny_patch4_window7_224_mm_classification` is a English model originally trained by djbp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_mm_classification_en_5.5.1_3.0_1734324904734.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_mm_classification_en_5.5.1_3.0_1734324904734.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+# imageDF: a Spark DataFrame with an "image" column,
+# e.g. loaded via spark.read.format("image").load("path/to/images")
+image_assembler = ImageAssembler()\
+    .setInputCol("image")\
+    .setOutputCol("image_assembler")
+
+imageClassifier = SwinForImageClassification.pretrained("swin_tiny_patch4_window7_224_mm_classification","en")\
+    .setInputCols("image_assembler")\
+    .setOutputCol("class")
+
+pipeline = Pipeline(stages=[
+    image_assembler,
+    imageClassifier,
+])
+
+pipelineModel = pipeline.fit(imageDF)
+
+pipelineDF = pipelineModel.transform(imageDF)
+
+```
+```scala
+
+val imageAssembler = new ImageAssembler()
+  .setInputCol("image")
+  .setOutputCol("image_assembler")
+
+val imageClassifier = SwinForImageClassification.pretrained("swin_tiny_patch4_window7_224_mm_classification","en")
+  .setInputCols("image_assembler")
+  .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(imageAssembler, imageClassifier))
+
+val pipelineModel = pipeline.fit(imageDF)
+
+val pipelineDF = pipelineModel.transform(imageDF)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_tiny_patch4_window7_224_mm_classification| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[image_assembler]| +|Output Labels:|[label]| +|Language:|en| +|Size:|206.3 MB| + +## References + +https://huggingface.co/djbp/swin-tiny-patch4-window7-224-MM_Classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_mm_classification_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_mm_classification_pipeline_en.md new file mode 100644 index 00000000000000..cdbfe7ae7e8276 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_mm_classification_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English swin_tiny_patch4_window7_224_mm_classification_pipeline pipeline SwinForImageClassification from djbp +author: John Snow Labs +name: swin_tiny_patch4_window7_224_mm_classification_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_tiny_patch4_window7_224_mm_classification_pipeline` is a English model originally trained by djbp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_mm_classification_pipeline_en_5.5.1_3.0_1734324915227.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_mm_classification_pipeline_en_5.5.1_3.0_1734324915227.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("swin_tiny_patch4_window7_224_mm_classification_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("swin_tiny_patch4_window7_224_mm_classification_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_tiny_patch4_window7_224_mm_classification_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|206.3 MB| + +## References + +https://huggingface.co/djbp/swin-tiny-patch4-window7-224-MM_Classification + +## Included Models + +- ImageAssembler +- SwinForImageClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_seg_swin_amal_finetuned_eurosat_en.md b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_seg_swin_amal_finetuned_eurosat_en.md new file mode 100644 index 00000000000000..88157c06daf86c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-swin_tiny_patch4_window7_224_seg_swin_amal_finetuned_eurosat_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English swin_tiny_patch4_window7_224_seg_swin_amal_finetuned_eurosat SwinForImageClassification from ezzouhri +author: John Snow Labs +name: swin_tiny_patch4_window7_224_seg_swin_amal_finetuned_eurosat +date: 2024-12-16 +tags: [en, open_source, onnx, image_classification, swin] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: SwinForImageClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swin_tiny_patch4_window7_224_seg_swin_amal_finetuned_eurosat` is a English model originally trained by ezzouhri. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_seg_swin_amal_finetuned_eurosat_en_5.5.1_3.0_1734325017992.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swin_tiny_patch4_window7_224_seg_swin_amal_finetuned_eurosat_en_5.5.1_3.0_1734325017992.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+# imageDF: a Spark DataFrame with an "image" column,
+# e.g. loaded via spark.read.format("image").load("path/to/images")
+image_assembler = ImageAssembler()\
+    .setInputCol("image")\
+    .setOutputCol("image_assembler")
+
+imageClassifier = SwinForImageClassification.pretrained("swin_tiny_patch4_window7_224_seg_swin_amal_finetuned_eurosat","en")\
+    .setInputCols("image_assembler")\
+    .setOutputCol("class")
+
+pipeline = Pipeline(stages=[
+    image_assembler,
+    imageClassifier,
+])
+
+pipelineModel = pipeline.fit(imageDF)
+
+pipelineDF = pipelineModel.transform(imageDF)
+
+```
+```scala
+
+val imageAssembler = new ImageAssembler()
+  .setInputCol("image")
+  .setOutputCol("image_assembler")
+
+val imageClassifier = SwinForImageClassification.pretrained("swin_tiny_patch4_window7_224_seg_swin_amal_finetuned_eurosat","en")
+  .setInputCols("image_assembler")
+  .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(imageAssembler, imageClassifier))
+
+val pipelineModel = pipeline.fit(imageDF)
+
+val pipelineDF = pipelineModel.transform(imageDF)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swin_tiny_patch4_window7_224_seg_swin_amal_finetuned_eurosat| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[image_assembler]| +|Output Labels:|[label]| +|Language:|en| +|Size:|206.3 MB| + +## References + +https://huggingface.co/ezzouhri/swin-tiny-patch4-window7-224-seg-swin-amal-finetuned-eurosat \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-t5_base_for2inf_music_en.md b/docs/_posts/ahmedlone127/2024-12-16-t5_base_for2inf_music_en.md new file mode 100644 index 00000000000000..0bd6059944e227 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-t5_base_for2inf_music_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English t5_base_for2inf_music T5Transformer from ggallipoli +author: John Snow Labs +name: t5_base_for2inf_music +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_base_for2inf_music` is a English model originally trained by ggallipoli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_base_for2inf_music_en_5.5.1_3.0_1734331542059.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_base_for2inf_music_en_5.5.1_3.0_1734331542059.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("t5_base_for2inf_music","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("t5_base_for2inf_music", "en")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
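+
+After the pipeline above has run, the generated text sits in the "output" annotation column; a small sketch of reading it back:
+
+```python
+# Each row carries the model's generated text in the "output" annotations.
+pipelineDF.select("output.result").show(truncate=False)
+```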
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_base_for2inf_music| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/ggallipoli/t5-base_for2inf_music \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-t5_base_for2inf_music_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-t5_base_for2inf_music_pipeline_en.md new file mode 100644 index 00000000000000..9ca837d16c0218 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-t5_base_for2inf_music_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English t5_base_for2inf_music_pipeline pipeline T5Transformer from ggallipoli +author: John Snow Labs +name: t5_base_for2inf_music_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_base_for2inf_music_pipeline` is a English model originally trained by ggallipoli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_base_for2inf_music_pipeline_en_5.5.1_3.0_1734331593423.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_base_for2inf_music_pipeline_en_5.5.1_3.0_1734331593423.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("t5_base_for2inf_music_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("t5_base_for2inf_music_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_base_for2inf_music_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/ggallipoli/t5-base_for2inf_music + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-t5_base_grammar_checker_en.md b/docs/_posts/ahmedlone127/2024-12-16-t5_base_grammar_checker_en.md new file mode 100644 index 00000000000000..350a35c2deba8c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-t5_base_grammar_checker_en.md @@ -0,0 +1,88 @@ +--- +layout: model +title: English t5_base_grammar_checker T5Transformer from Ragnov +author: John Snow Labs +name: t5_base_grammar_checker +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_base_grammar_checker` is a English model originally trained by Ragnov. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_base_grammar_checker_en_5.5.1_3.0_1734329113743.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_base_grammar_checker_en_5.5.1_3.0_1734329113743.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("t5_base_grammar_checker","en") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("t5_base_grammar_checker", "en")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+</div>
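+
+T5 checkpoints are typically driven by a task prefix prepended to the input. If this grammar-correction model expects one, it can be supplied through `setTask`; the exact prefix below is an assumption, not taken from the upstream model card:
+
+```python
+# "grammar:" is a hypothetical prefix; check the original model card for the real one.
+t5 = T5Transformer.pretrained("t5_base_grammar_checker", "en") \
+    .setTask("grammar:") \
+    .setInputCols(["document"]) \
+    .setOutputCol("output")
+```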
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_base_grammar_checker| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|974.2 MB| + +## References + +References + +https://huggingface.co/Ragnov/T5-Base-Grammar-Checker \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-t5_base_grammar_checker_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-t5_base_grammar_checker_pipeline_en.md new file mode 100644 index 00000000000000..f2e09cbb2294b8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-t5_base_grammar_checker_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English t5_base_grammar_checker_pipeline pipeline T5Transformer from Ragnov +author: John Snow Labs +name: t5_base_grammar_checker_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_base_grammar_checker_pipeline` is a English model originally trained by Ragnov. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_base_grammar_checker_pipeline_en_5.5.1_3.0_1734329172450.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_base_grammar_checker_pipeline_en_5.5.1_3.0_1734329172450.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +pipeline = PretrainedPipeline("t5_base_grammar_checker_pipeline", lang = "en") +annotations = pipeline.transform(df) +``` +```scala +val pipeline = new PretrainedPipeline("t5_base_grammar_checker_pipeline", lang = "en") +val annotations = pipeline.transform(df) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_base_grammar_checker_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|974.2 MB| + +## References + +References + +https://huggingface.co/Ragnov/T5-Base-Grammar-Checker + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-t5_fine_tuned_model_en.md b/docs/_posts/ahmedlone127/2024-12-16-t5_fine_tuned_model_en.md new file mode 100644 index 00000000000000..91276be6ec3ba9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-t5_fine_tuned_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English T5ForConditionalGeneration Cased model (from marcus2000) +author: John Snow Labs +name: t5_fine_tuned_model +date: 2024-12-16 +tags: [en, open_source, t5, onnx] +task: Text Generation +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5ForConditionalGeneration model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tuned_t5_model` is a English model originally trained by `marcus2000`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_fine_tuned_model_en_5.5.1_3.0_1734328808232.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_fine_tuned_model_en_5.5.1_3.0_1734328808232.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+t5 = T5Transformer.pretrained("t5_fine_tuned_model","en") \
+    .setInputCols("document") \
+    .setOutputCol("answers")
+
+pipeline = Pipeline(stages=[documentAssembler, t5])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("t5_fine_tuned_model","en")
+  .setInputCols("document")
+  .setOutputCol("answers")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_fine_tuned_model| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|309.4 MB| + +## References + +References + +References + +- https://huggingface.co/marcus2000/fine_tuned_t5_model +- https://paperswithcode.com/sota?task=automatic-speech-recognition&dataset=Librispeech+%28clean%29 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-t5_fine_tuned_model_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-t5_fine_tuned_model_pipeline_en.md new file mode 100644 index 00000000000000..a037d6b8bb09fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-t5_fine_tuned_model_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English t5_fine_tuned_model_pipeline pipeline T5Transformer from marcus2000 +author: John Snow Labs +name: t5_fine_tuned_model_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_fine_tuned_model_pipeline` is a English model originally trained by marcus2000. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_fine_tuned_model_pipeline_en_5.5.1_3.0_1734328832195.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_fine_tuned_model_pipeline_en_5.5.1_3.0_1734328832195.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +pipeline = PretrainedPipeline("t5_fine_tuned_model_pipeline", lang = "en") +annotations = pipeline.transform(df) +``` +```scala +val pipeline = new PretrainedPipeline("t5_fine_tuned_model_pipeline", lang = "en") +val annotations = pipeline.transform(df) +``` +
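
The snippet above assumes `df` is an existing Spark DataFrame with a `text` column. A self-contained sketch (the input string is illustrative) could build it as follows, and `PretrainedPipeline.annotate` offers a quick single-string alternative:

```python
# Hypothetical input DataFrame with a "text" column
df = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
annotations = pipeline.transform(df)

# Or annotate a single string directly
print(pipeline.annotate("PUT YOUR STRING HERE"))
```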
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_fine_tuned_model_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|309.4 MB| + +## References + +References + +https://huggingface.co/marcus2000/fine_tuned_t5_model + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-t5_large_pos2neg_en.md b/docs/_posts/ahmedlone127/2024-12-16-t5_large_pos2neg_en.md new file mode 100644 index 00000000000000..058dd4c3d6dd47 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-t5_large_pos2neg_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English t5_large_pos2neg T5Transformer from ggallipoli +author: John Snow Labs +name: t5_large_pos2neg +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_large_pos2neg` is a English model originally trained by ggallipoli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_large_pos2neg_en_5.5.1_3.0_1734330805129.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_large_pos2neg_en_5.5.1_3.0_1734330805129.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("t5_large_pos2neg","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("t5_large_pos2neg", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_large_pos2neg| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|3.0 GB| + +## References + +https://huggingface.co/ggallipoli/t5-large_pos2neg \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-t5_large_pos2neg_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-t5_large_pos2neg_pipeline_en.md new file mode 100644 index 00000000000000..23d813facf54ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-t5_large_pos2neg_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English t5_large_pos2neg_pipeline pipeline T5Transformer from ggallipoli +author: John Snow Labs +name: t5_large_pos2neg_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_large_pos2neg_pipeline` is a English model originally trained by ggallipoli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_large_pos2neg_pipeline_en_5.5.1_3.0_1734330952464.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_large_pos2neg_pipeline_en_5.5.1_3.0_1734330952464.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("t5_large_pos2neg_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("t5_large_pos2neg_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_large_pos2neg_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|3.0 GB| + +## References + +https://huggingface.co/ggallipoli/t5-large_pos2neg + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-t5_small_autotagging_en.md b/docs/_posts/ahmedlone127/2024-12-16-t5_small_autotagging_en.md new file mode 100644 index 00000000000000..26a391ab773933 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-t5_small_autotagging_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English t5_small_autotagging T5Transformer from RevoltronTechno +author: John Snow Labs +name: t5_small_autotagging +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_small_autotagging` is a English model originally trained by RevoltronTechno. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_small_autotagging_en_5.5.1_3.0_1734328064501.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_small_autotagging_en_5.5.1_3.0_1734328064501.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("t5_small_autotagging","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("t5_small_autotagging", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_small_autotagging| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|341.9 MB| + +## References + +https://huggingface.co/RevoltronTechno/t5_small_autotagging \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-t5_small_autotagging_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-t5_small_autotagging_pipeline_en.md new file mode 100644 index 00000000000000..5c164c4b03f769 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-t5_small_autotagging_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English t5_small_autotagging_pipeline pipeline T5Transformer from RevoltronTechno +author: John Snow Labs +name: t5_small_autotagging_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_small_autotagging_pipeline` is a English model originally trained by RevoltronTechno. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_small_autotagging_pipeline_en_5.5.1_3.0_1734328086039.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_small_autotagging_pipeline_en_5.5.1_3.0_1734328086039.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("t5_small_autotagging_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("t5_small_autotagging_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_small_autotagging_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|341.9 MB| + +## References + +https://huggingface.co/RevoltronTechno/t5_small_autotagging + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_aspect_01_en.md b/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_aspect_01_en.md new file mode 100644 index 00000000000000..b3a6fc9173f407 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_aspect_01_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English t5_small_finetuned_aspect_01 T5Transformer from Ftmhd +author: John Snow Labs +name: t5_small_finetuned_aspect_01 +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_small_finetuned_aspect_01` is a English model originally trained by Ftmhd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_small_finetuned_aspect_01_en_5.5.1_3.0_1734330822734.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_small_finetuned_aspect_01_en_5.5.1_3.0_1734330822734.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("t5_small_finetuned_aspect_01","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("t5_small_finetuned_aspect_01", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_small_finetuned_aspect_01| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|302.9 MB| + +## References + +https://huggingface.co/Ftmhd/t5-small-finetuned-aspect_01 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_aspect_01_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_aspect_01_pipeline_en.md new file mode 100644 index 00000000000000..11e009a958730c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_aspect_01_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English t5_small_finetuned_aspect_01_pipeline pipeline T5Transformer from Ftmhd +author: John Snow Labs +name: t5_small_finetuned_aspect_01_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_small_finetuned_aspect_01_pipeline` is a English model originally trained by Ftmhd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_small_finetuned_aspect_01_pipeline_en_5.5.1_3.0_1734330849297.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_small_finetuned_aspect_01_pipeline_en_5.5.1_3.0_1734330849297.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("t5_small_finetuned_aspect_01_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("t5_small_finetuned_aspect_01_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_small_finetuned_aspect_01_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|302.9 MB| + +## References + +https://huggingface.co/Ftmhd/t5-small-finetuned-aspect_01 + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_english_tonga_tonga_islands_english_en.md b/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_english_tonga_tonga_islands_english_en.md new file mode 100644 index 00000000000000..a3ae71134e543f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_english_tonga_tonga_islands_english_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English t5_small_finetuned_english_tonga_tonga_islands_english T5Transformer from nirubuh +author: John Snow Labs +name: t5_small_finetuned_english_tonga_tonga_islands_english +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_small_finetuned_english_tonga_tonga_islands_english` is a English model originally trained by nirubuh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_small_finetuned_english_tonga_tonga_islands_english_en_5.5.1_3.0_1734327608198.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_small_finetuned_english_tonga_tonga_islands_english_en_5.5.1_3.0_1734327608198.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("t5_small_finetuned_english_tonga_tonga_islands_english","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("t5_small_finetuned_english_tonga_tonga_islands_english", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_small_finetuned_english_tonga_tonga_islands_english| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|334.8 MB| + +## References + +https://huggingface.co/nirubuh/t5-small-finetuned-en-to-en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_english_tonga_tonga_islands_english_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_english_tonga_tonga_islands_english_pipeline_en.md new file mode 100644 index 00000000000000..e417679f20b50d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_english_tonga_tonga_islands_english_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English t5_small_finetuned_english_tonga_tonga_islands_english_pipeline pipeline T5Transformer from nirubuh +author: John Snow Labs +name: t5_small_finetuned_english_tonga_tonga_islands_english_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_small_finetuned_english_tonga_tonga_islands_english_pipeline` is a English model originally trained by nirubuh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_small_finetuned_english_tonga_tonga_islands_english_pipeline_en_5.5.1_3.0_1734327628092.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_small_finetuned_english_tonga_tonga_islands_english_pipeline_en_5.5.1_3.0_1734327628092.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("t5_small_finetuned_english_tonga_tonga_islands_english_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("t5_small_finetuned_english_tonga_tonga_islands_english_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_small_finetuned_english_tonga_tonga_islands_english_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|334.8 MB| + +## References + +https://huggingface.co/nirubuh/t5-small-finetuned-en-to-en + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_news_ftmhd_en.md b/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_news_ftmhd_en.md new file mode 100644 index 00000000000000..49f8012408bffa --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_news_ftmhd_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English t5_small_finetuned_news_ftmhd T5Transformer from Ftmhd +author: John Snow Labs +name: t5_small_finetuned_news_ftmhd +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_small_finetuned_news_ftmhd` is a English model originally trained by Ftmhd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_small_finetuned_news_ftmhd_en_5.5.1_3.0_1734329607083.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_small_finetuned_news_ftmhd_en_5.5.1_3.0_1734329607083.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("t5_small_finetuned_news_ftmhd","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("t5_small_finetuned_news_ftmhd", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_small_finetuned_news_ftmhd| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|320.9 MB| + +## References + +https://huggingface.co/Ftmhd/t5-small-finetuned-news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_samsum_vasumathin298_en.md b/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_samsum_vasumathin298_en.md new file mode 100644 index 00000000000000..2e1033ef109dfe --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_samsum_vasumathin298_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English t5_small_finetuned_samsum_vasumathin298 T5Transformer from vasumathin298 +author: John Snow Labs +name: t5_small_finetuned_samsum_vasumathin298 +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_small_finetuned_samsum_vasumathin298` is a English model originally trained by vasumathin298. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_small_finetuned_samsum_vasumathin298_en_5.5.1_3.0_1734332804754.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_small_finetuned_samsum_vasumathin298_en_5.5.1_3.0_1734332804754.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("t5_small_finetuned_samsum_vasumathin298","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("t5_small_finetuned_samsum_vasumathin298", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_small_finetuned_samsum_vasumathin298| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|328.2 MB| + +## References + +https://huggingface.co/vasumathin298/t5-small-finetuned-samsum \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_samsum_vasumathin298_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_samsum_vasumathin298_pipeline_en.md new file mode 100644 index 00000000000000..d36c19daf626ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_samsum_vasumathin298_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English t5_small_finetuned_samsum_vasumathin298_pipeline pipeline T5Transformer from vasumathin298 +author: John Snow Labs +name: t5_small_finetuned_samsum_vasumathin298_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_small_finetuned_samsum_vasumathin298_pipeline` is a English model originally trained by vasumathin298. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_small_finetuned_samsum_vasumathin298_pipeline_en_5.5.1_3.0_1734332827007.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_small_finetuned_samsum_vasumathin298_pipeline_en_5.5.1_3.0_1734332827007.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("t5_small_finetuned_samsum_vasumathin298_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("t5_small_finetuned_samsum_vasumathin298_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_small_finetuned_samsum_vasumathin298_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|328.2 MB| + +## References + +https://huggingface.co/vasumathin298/t5-small-finetuned-samsum + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_xsum_arinzeo_en.md b/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_xsum_arinzeo_en.md new file mode 100644 index 00000000000000..834e81ef9b490e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_xsum_arinzeo_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English t5_small_finetuned_xsum_arinzeo T5Transformer from arinzeo +author: John Snow Labs +name: t5_small_finetuned_xsum_arinzeo +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_small_finetuned_xsum_arinzeo` is a English model originally trained by arinzeo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_small_finetuned_xsum_arinzeo_en_5.5.1_3.0_1734333117390.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_small_finetuned_xsum_arinzeo_en_5.5.1_3.0_1734333117390.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("t5_small_finetuned_xsum_arinzeo","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("t5_small_finetuned_xsum_arinzeo", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_small_finetuned_xsum_arinzeo| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|330.3 MB| + +## References + +https://huggingface.co/arinzeo/t5-small-finetuned-xsum \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_xsum_arinzeo_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_xsum_arinzeo_pipeline_en.md new file mode 100644 index 00000000000000..bdc4809ee461e1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_xsum_arinzeo_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English t5_small_finetuned_xsum_arinzeo_pipeline pipeline T5Transformer from arinzeo +author: John Snow Labs +name: t5_small_finetuned_xsum_arinzeo_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_small_finetuned_xsum_arinzeo_pipeline` is a English model originally trained by arinzeo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_small_finetuned_xsum_arinzeo_pipeline_en_5.5.1_3.0_1734333138396.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_small_finetuned_xsum_arinzeo_pipeline_en_5.5.1_3.0_1734333138396.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("t5_small_finetuned_xsum_arinzeo_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("t5_small_finetuned_xsum_arinzeo_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_small_finetuned_xsum_arinzeo_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|330.3 MB| + +## References + +https://huggingface.co/arinzeo/t5-small-finetuned-xsum + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_xsum_danish24_en.md b/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_xsum_danish24_en.md new file mode 100644 index 00000000000000..f98debbe3079f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_xsum_danish24_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English t5_small_finetuned_xsum_danish24 T5Transformer from Danish24 +author: John Snow Labs +name: t5_small_finetuned_xsum_danish24 +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_small_finetuned_xsum_danish24` is a English model originally trained by Danish24. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_small_finetuned_xsum_danish24_en_5.5.1_3.0_1734327127831.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_small_finetuned_xsum_danish24_en_5.5.1_3.0_1734327127831.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("t5_small_finetuned_xsum_danish24","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("t5_small_finetuned_xsum_danish24", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_small_finetuned_xsum_danish24| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|338.9 MB| + +## References + +https://huggingface.co/Danish24/t5-small-finetuned-xsum \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_xsum_danish24_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_xsum_danish24_pipeline_en.md new file mode 100644 index 00000000000000..49de6c6c9b0ddb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-t5_small_finetuned_xsum_danish24_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English t5_small_finetuned_xsum_danish24_pipeline pipeline T5Transformer from Danish24 +author: John Snow Labs +name: t5_small_finetuned_xsum_danish24_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_small_finetuned_xsum_danish24_pipeline` is a English model originally trained by Danish24. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_small_finetuned_xsum_danish24_pipeline_en_5.5.1_3.0_1734327150804.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_small_finetuned_xsum_danish24_pipeline_en_5.5.1_3.0_1734327150804.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("t5_small_finetuned_xsum_danish24_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("t5_small_finetuned_xsum_danish24_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_small_finetuned_xsum_danish24_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|338.9 MB| + +## References + +https://huggingface.co/Danish24/t5-small-finetuned-xsum + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-t5_spell_base_ru.md b/docs/_posts/ahmedlone127/2024-12-16-t5_spell_base_ru.md new file mode 100644 index 00000000000000..04958b8d02e75c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-t5_spell_base_ru.md @@ -0,0 +1,86 @@ +--- +layout: model +title: Russian t5_spell_base T5Transformer from Grpp +author: John Snow Labs +name: t5_spell_base +date: 2024-12-16 +tags: [ru, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: ru +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5_spell_base` is a Russian model originally trained by Grpp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5_spell_base_ru_5.5.1_3.0_1734332602928.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5_spell_base_ru_5.5.1_3.0_1734332602928.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("t5_spell_base","ru") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("t5_spell_base", "ru")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
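
Because this model is registered for Russian (`ru`), the English placeholder sentence above is only a template; a more representative (hypothetical) input would be Russian text, for example:

```python
# Hypothetical Russian input with intentional misspellings; replace with your own text
data = spark.createDataFrame([["севодня харошая пагода"]]).toDF("text")
result = pipelineModel.transform(data)
result.select("output.result").show(truncate=False)
```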
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5_spell_base| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|ru| +|Size:|997.2 MB| + +## References + +https://huggingface.co/Grpp/T5_spell-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-t5small_enfr_opus_en.md b/docs/_posts/ahmedlone127/2024-12-16-t5small_enfr_opus_en.md new file mode 100644 index 00000000000000..150fc4a44c1673 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-t5small_enfr_opus_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English t5small_enfr_opus T5Transformer from Mat17892 +author: John Snow Labs +name: t5small_enfr_opus +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`t5small_enfr_opus` is a English model originally trained by Mat17892. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/t5small_enfr_opus_en_5.5.1_3.0_1734332084384.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/t5small_enfr_opus_en_5.5.1_3.0_1734332084384.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("t5small_enfr_opus","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("t5small_enfr_opus", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|t5small_enfr_opus| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|348.8 MB| + +## References + +https://huggingface.co/Mat17892/t5small_enfr_opus \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-test_bertlike_ner_en.md b/docs/_posts/ahmedlone127/2024-12-16-test_bertlike_ner_en.md new file mode 100644 index 00000000000000..5a3a6393de5b86 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-test_bertlike_ner_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English test_bertlike_ner BertForTokenClassification from witalo +author: John Snow Labs +name: test_bertlike_ner +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, bert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`test_bertlike_ner` is a English model originally trained by witalo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_bertlike_ner_en_5.5.1_3.0_1734336915271.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_bertlike_ner_en_5.5.1_3.0_1734336915271.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = BertForTokenClassification.pretrained("test_bertlike_ner","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = BertForTokenClassification.pretrained("test_bertlike_ner", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
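
The `ner` column holds token-level tags. If the model's labels follow the usual IOB scheme, a `NerConverter` stage can group them into entity chunks; this is an optional sketch, not part of the original card:

```python
from sparknlp.annotator import NerConverter

# Group token-level NER tags into entity chunks (column names follow the snippet above)
converter = NerConverter() \
    .setInputCols(["document", "token", "ner"]) \
    .setOutputCol("ner_chunk")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier, converter])
```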
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_bertlike_ner| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.0 MB| + +## References + +https://huggingface.co/witalo/test-bertlike-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-text2triple_flan_t5_en.md b/docs/_posts/ahmedlone127/2024-12-16-text2triple_flan_t5_en.md new file mode 100644 index 00000000000000..4c44814dcfb08e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-text2triple_flan_t5_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English text2triple_flan_t5 T5Transformer from pat-jj +author: John Snow Labs +name: text2triple_flan_t5 +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`text2triple_flan_t5` is a English model originally trained by pat-jj. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/text2triple_flan_t5_en_5.5.1_3.0_1734333823769.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/text2triple_flan_t5_en_5.5.1_3.0_1734333823769.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("text2triple_flan_t5","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("text2triple_flan_t5", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|text2triple_flan_t5| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|3.1 GB| + +## References + +https://huggingface.co/pat-jj/text2triple-flan-t5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-text2triple_flan_t5_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-text2triple_flan_t5_pipeline_en.md new file mode 100644 index 00000000000000..47d09558610d6a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-text2triple_flan_t5_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English text2triple_flan_t5_pipeline pipeline T5Transformer from pat-jj +author: John Snow Labs +name: text2triple_flan_t5_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`text2triple_flan_t5_pipeline` is a English model originally trained by pat-jj. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/text2triple_flan_t5_pipeline_en_5.5.1_3.0_1734333970588.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/text2triple_flan_t5_pipeline_en_5.5.1_3.0_1734333970588.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("text2triple_flan_t5_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("text2triple_flan_t5_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|text2triple_flan_t5_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|3.1 GB| + +## References + +https://huggingface.co/pat-jj/text2triple-flan-t5 + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-text_tonga_tonga_islands_sparql_t5_base_2024_10_01_04_05_en.md b/docs/_posts/ahmedlone127/2024-12-16-text_tonga_tonga_islands_sparql_t5_base_2024_10_01_04_05_en.md new file mode 100644 index 00000000000000..1c138b3b582e1f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-text_tonga_tonga_islands_sparql_t5_base_2024_10_01_04_05_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English text_tonga_tonga_islands_sparql_t5_base_2024_10_01_04_05 T5Transformer from PopularPenguin +author: John Snow Labs +name: text_tonga_tonga_islands_sparql_t5_base_2024_10_01_04_05 +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`text_tonga_tonga_islands_sparql_t5_base_2024_10_01_04_05` is a English model originally trained by PopularPenguin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/text_tonga_tonga_islands_sparql_t5_base_2024_10_01_04_05_en_5.5.1_3.0_1734327355502.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/text_tonga_tonga_islands_sparql_t5_base_2024_10_01_04_05_en_5.5.1_3.0_1734327355502.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

t5 = T5Transformer.pretrained("text_tonga_tonga_islands_sparql_t5_base_2024_10_01_04_05","en") \
    .setInputCols(["document"]) \
    .setOutputCol("output")

pipeline = Pipeline().setStages([documentAssembler, t5])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val t5 = T5Transformer.pretrained("text_tonga_tonga_islands_sparql_t5_base_2024_10_01_04_05", "en")
    .setInputCols(Array("document"))
    .setOutputCol("output")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|text_tonga_tonga_islands_sparql_t5_base_2024_10_01_04_05| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/PopularPenguin/text-to-sparql-t5-base-2024-10-01_04-05 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-text_tonga_tonga_islands_sparql_t5_base_2024_10_01_04_05_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-text_tonga_tonga_islands_sparql_t5_base_2024_10_01_04_05_pipeline_en.md new file mode 100644 index 00000000000000..38f9e270598adb --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-text_tonga_tonga_islands_sparql_t5_base_2024_10_01_04_05_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English text_tonga_tonga_islands_sparql_t5_base_2024_10_01_04_05_pipeline pipeline T5Transformer from PopularPenguin +author: John Snow Labs +name: text_tonga_tonga_islands_sparql_t5_base_2024_10_01_04_05_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`text_tonga_tonga_islands_sparql_t5_base_2024_10_01_04_05_pipeline` is a English model originally trained by PopularPenguin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/text_tonga_tonga_islands_sparql_t5_base_2024_10_01_04_05_pipeline_en_5.5.1_3.0_1734327407524.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/text_tonga_tonga_islands_sparql_t5_base_2024_10_01_04_05_pipeline_en_5.5.1_3.0_1734327407524.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("text_tonga_tonga_islands_sparql_t5_base_2024_10_01_04_05_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("text_tonga_tonga_islands_sparql_t5_base_2024_10_01_04_05_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|text_tonga_tonga_islands_sparql_t5_base_2024_10_01_04_05_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/PopularPenguin/text-to-sparql-t5-base-2024-10-01_04-05 + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-tiny_random_swinforimageclassification_en.md b/docs/_posts/ahmedlone127/2024-12-16-tiny_random_swinforimageclassification_en.md new file mode 100644 index 00000000000000..aa9ed5002212e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-tiny_random_swinforimageclassification_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English tiny_random_swinforimageclassification SwinForImageClassification from hf-tiny-model-private +author: John Snow Labs +name: tiny_random_swinforimageclassification +date: 2024-12-16 +tags: [en, open_source, onnx, image_classification, swin] +task: Image Classification +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: SwinForImageClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained SwinForImageClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`tiny_random_swinforimageclassification` is a English model originally trained by hf-tiny-model-private. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_random_swinforimageclassification_en_5.5.1_3.0_1734325332366.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_random_swinforimageclassification_en_5.5.1_3.0_1734325332366.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+image_assembler = ImageAssembler()\
+  .setInputCol("image")\
+  .setOutputCol("image_assembler")
+
+imageClassifier = SwinForImageClassification.pretrained("tiny_random_swinforimageclassification","en")\
+  .setInputCols("image_assembler")\
+  .setOutputCol("class")
+
+pipeline = Pipeline(stages=[
+  image_assembler,
+  imageClassifier,
+])
+
+pipelineModel = pipeline.fit(imageDF)
+
+pipelineDF = pipelineModel.transform(imageDF)
+
+```
+```scala
+
+val imageAssembler = new ImageAssembler()
+  .setInputCol("image")
+  .setOutputCol("image_assembler")
+
+val imageClassifier = SwinForImageClassification.pretrained("tiny_random_swinforimageclassification","en")
+  .setInputCols("image_assembler")
+  .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(imageAssembler, imageClassifier))
+
+val pipelineModel = pipeline.fit(imageDF)
+
+val pipelineDF = pipelineModel.transform(imageDF)
+
+```
+
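+
+Both snippets assume that an `imageDF` of images has already been loaded. A minimal sketch using Spark's built-in image reader (the folder path is a placeholder assumption):
+
+```python
+# Load a folder of images into the "image" column expected by ImageAssembler
+imageDF = spark.read \
+    .format("image") \
+    .option("dropInvalid", True) \
+    .load("path/to/images")
+```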
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_random_swinforimageclassification| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[image_assembler]| +|Output Labels:|[label]| +|Language:|en| +|Size:|547.4 KB| + +## References + +https://huggingface.co/hf-tiny-model-private/tiny-random-SwinForImageClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-tner_xlm_roberta_base_ontonotes5_earnings21_normalized_en.md b/docs/_posts/ahmedlone127/2024-12-16-tner_xlm_roberta_base_ontonotes5_earnings21_normalized_en.md new file mode 100644 index 00000000000000..5806687af0c5c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-tner_xlm_roberta_base_ontonotes5_earnings21_normalized_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English tner_xlm_roberta_base_ontonotes5_earnings21_normalized XlmRoBertaForTokenClassification from anonymoussubmissions +author: John Snow Labs +name: tner_xlm_roberta_base_ontonotes5_earnings21_normalized +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`tner_xlm_roberta_base_ontonotes5_earnings21_normalized` is a English model originally trained by anonymoussubmissions. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tner_xlm_roberta_base_ontonotes5_earnings21_normalized_en_5.5.1_3.0_1734321826563.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tner_xlm_roberta_base_ontonotes5_earnings21_normalized_en_5.5.1_3.0_1734321826563.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+  .setInputCol('text') \
+  .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+  .setInputCols(['document']) \
+  .setOutputCol('token')
+
+tokenClassifier = XlmRoBertaForTokenClassification.pretrained("tner_xlm_roberta_base_ontonotes5_earnings21_normalized","en") \
+  .setInputCols(["document","token"]) \
+  .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("tner_xlm_roberta_base_ontonotes5_earnings21_normalized", "en")
+  .setInputCols(Array("document","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
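+
+After `transform`, the predicted tags sit in the `ner` annotation column. A display-only sketch for inspecting tokens next to their labels (column names follow the Python snippet above):
+
+```python
+# Show tokens and their predicted entity tags side by side
+pipelineDF.select("token.result", "ner.result").show(truncate=False)
+```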
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tner_xlm_roberta_base_ontonotes5_earnings21_normalized| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|859.6 MB| + +## References + +https://huggingface.co/anonymoussubmissions/tner-xlm-roberta-base-ontonotes5-earnings21-normalized \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-tner_xlm_roberta_base_ontonotes5_earnings21_normalized_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-tner_xlm_roberta_base_ontonotes5_earnings21_normalized_pipeline_en.md new file mode 100644 index 00000000000000..1c61261ae9c9fa --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-tner_xlm_roberta_base_ontonotes5_earnings21_normalized_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English tner_xlm_roberta_base_ontonotes5_earnings21_normalized_pipeline pipeline XlmRoBertaForTokenClassification from anonymoussubmissions +author: John Snow Labs +name: tner_xlm_roberta_base_ontonotes5_earnings21_normalized_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`tner_xlm_roberta_base_ontonotes5_earnings21_normalized_pipeline` is a English model originally trained by anonymoussubmissions. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tner_xlm_roberta_base_ontonotes5_earnings21_normalized_pipeline_en_5.5.1_3.0_1734321890206.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tner_xlm_roberta_base_ontonotes5_earnings21_normalized_pipeline_en_5.5.1_3.0_1734321890206.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("tner_xlm_roberta_base_ontonotes5_earnings21_normalized_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("tner_xlm_roberta_base_ontonotes5_earnings21_normalized_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tner_xlm_roberta_base_ontonotes5_earnings21_normalized_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|859.6 MB| + +## References + +https://huggingface.co/anonymoussubmissions/tner-xlm-roberta-base-ontonotes5-earnings21-normalized + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-tokenizertestingmtsufall2024softwareengineering_en.md b/docs/_posts/ahmedlone127/2024-12-16-tokenizertestingmtsufall2024softwareengineering_en.md new file mode 100644 index 00000000000000..4f893d9ad2ca37 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-tokenizertestingmtsufall2024softwareengineering_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English tokenizertestingmtsufall2024softwareengineering T5Transformer from cheaptrix +author: John Snow Labs +name: tokenizertestingmtsufall2024softwareengineering +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`tokenizertestingmtsufall2024softwareengineering` is a English model originally trained by cheaptrix. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tokenizertestingmtsufall2024softwareengineering_en_5.5.1_3.0_1734327728999.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tokenizertestingmtsufall2024softwareengineering_en_5.5.1_3.0_1734327728999.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+  .setInputCol('text') \
+  .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("tokenizertestingmtsufall2024softwareengineering","en") \
+  .setInputCols(["document"]) \
+  .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("tokenizertestingmtsufall2024softwareengineering", "en")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tokenizertestingmtsufall2024softwareengineering| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|341.0 MB| + +## References + +https://huggingface.co/cheaptrix/TokenizerTestingMTSUFall2024SoftwareEngineering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-tokenizertestingmtsufall2024softwareengineering_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-tokenizertestingmtsufall2024softwareengineering_pipeline_en.md new file mode 100644 index 00000000000000..d068ee128cac79 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-tokenizertestingmtsufall2024softwareengineering_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English tokenizertestingmtsufall2024softwareengineering_pipeline pipeline T5Transformer from cheaptrix +author: John Snow Labs +name: tokenizertestingmtsufall2024softwareengineering_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`tokenizertestingmtsufall2024softwareengineering_pipeline` is a English model originally trained by cheaptrix. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tokenizertestingmtsufall2024softwareengineering_pipeline_en_5.5.1_3.0_1734327749937.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tokenizertestingmtsufall2024softwareengineering_pipeline_en_5.5.1_3.0_1734327749937.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("tokenizertestingmtsufall2024softwareengineering_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("tokenizertestingmtsufall2024softwareengineering_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tokenizertestingmtsufall2024softwareengineering_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|341.0 MB| + +## References + +https://huggingface.co/cheaptrix/TokenizerTestingMTSUFall2024SoftwareEngineering + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-training_df_fullctxt_filtered_0_15_biobertqa_en.md b/docs/_posts/ahmedlone127/2024-12-16-training_df_fullctxt_filtered_0_15_biobertqa_en.md new file mode 100644 index 00000000000000..12c6faf8ce0869 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-training_df_fullctxt_filtered_0_15_biobertqa_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English training_df_fullctxt_filtered_0_15_biobertqa BertForQuestionAnswering from LeWince +author: John Snow Labs +name: training_df_fullctxt_filtered_0_15_biobertqa +date: 2024-12-16 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`training_df_fullctxt_filtered_0_15_biobertqa` is a English model originally trained by LeWince. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/training_df_fullctxt_filtered_0_15_biobertqa_en_5.5.1_3.0_1734338391361.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/training_df_fullctxt_filtered_0_15_biobertqa_en_5.5.1_3.0_1734338391361.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = MultiDocumentAssembler() \
+  .setInputCols(["question", "context"]) \
+  .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("training_df_fullctxt_filtered_0_15_biobertqa","en") \
+  .setInputCols(["document_question","document_context"]) \
+  .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
+data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new MultiDocumentAssembler()
+  .setInputCols(Array("question", "context"))
+  .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("training_df_fullctxt_filtered_0_15_biobertqa", "en")
+  .setInputCols(Array("document_question","document_context"))
+  .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDF("question", "context")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
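+
+The extracted answer span ends up in the `answer` annotation column. A display-only sketch for reading it back (assuming the column names from the Python example):
+
+```python
+# Show the answer predicted for each question/context pair
+pipelineDF.select("answer.result").show(truncate=False)
+```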
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|training_df_fullctxt_filtered_0_15_biobertqa| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/LeWince/training_df_fullctxt_filtered_0_15_BioBertQA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-translate_english_tonga_tonga_islands_turkish_pipeline_tr.md b/docs/_posts/ahmedlone127/2024-12-16-translate_english_tonga_tonga_islands_turkish_pipeline_tr.md new file mode 100644 index 00000000000000..9d6ca8856d23ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-translate_english_tonga_tonga_islands_turkish_pipeline_tr.md @@ -0,0 +1,69 @@ +--- +layout: model +title: Turkish translate_english_tonga_tonga_islands_turkish_pipeline pipeline T5Transformer from suayptalha +author: John Snow Labs +name: translate_english_tonga_tonga_islands_turkish_pipeline +date: 2024-12-16 +tags: [tr, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: tr +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`translate_english_tonga_tonga_islands_turkish_pipeline` is a Turkish model originally trained by suayptalha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/translate_english_tonga_tonga_islands_turkish_pipeline_tr_5.5.1_3.0_1734327293458.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/translate_english_tonga_tonga_islands_turkish_pipeline_tr_5.5.1_3.0_1734327293458.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("translate_english_tonga_tonga_islands_turkish_pipeline", lang = "tr") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("translate_english_tonga_tonga_islands_turkish_pipeline", lang = "tr") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|translate_english_tonga_tonga_islands_turkish_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|tr| +|Size:|1.0 GB| + +## References + +https://huggingface.co/suayptalha/Translate-EN-to-TR + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-translate_english_tonga_tonga_islands_turkish_tr.md b/docs/_posts/ahmedlone127/2024-12-16-translate_english_tonga_tonga_islands_turkish_tr.md new file mode 100644 index 00000000000000..6b35271fbc2fff --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-translate_english_tonga_tonga_islands_turkish_tr.md @@ -0,0 +1,86 @@ +--- +layout: model +title: Turkish translate_english_tonga_tonga_islands_turkish T5Transformer from suayptalha +author: John Snow Labs +name: translate_english_tonga_tonga_islands_turkish +date: 2024-12-16 +tags: [tr, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: tr +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`translate_english_tonga_tonga_islands_turkish` is a Turkish model originally trained by suayptalha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/translate_english_tonga_tonga_islands_turkish_tr_5.5.1_3.0_1734327243248.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/translate_english_tonga_tonga_islands_turkish_tr_5.5.1_3.0_1734327243248.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+  .setInputCol('text') \
+  .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("translate_english_tonga_tonga_islands_turkish","tr") \
+  .setInputCols(["document"]) \
+  .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("translate_english_tonga_tonga_islands_turkish", "tr")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|translate_english_tonga_tonga_islands_turkish| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|tr| +|Size:|1.0 GB| + +## References + +https://huggingface.co/suayptalha/Translate-EN-to-TR \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-translation_model_zxdexpo_en.md b/docs/_posts/ahmedlone127/2024-12-16-translation_model_zxdexpo_en.md new file mode 100644 index 00000000000000..987e77d331b47d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-translation_model_zxdexpo_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English translation_model_zxdexpo T5Transformer from zxdexpo +author: John Snow Labs +name: translation_model_zxdexpo +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`translation_model_zxdexpo` is a English model originally trained by zxdexpo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/translation_model_zxdexpo_en_5.5.1_3.0_1734331634244.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/translation_model_zxdexpo_en_5.5.1_3.0_1734331634244.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+  .setInputCol('text') \
+  .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("translation_model_zxdexpo","en") \
+  .setInputCols(["document"]) \
+  .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("translation_model_zxdexpo", "en")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|translation_model_zxdexpo| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|336.3 MB| + +## References + +https://huggingface.co/zxdexpo/translation_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-translation_model_zxdexpo_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-translation_model_zxdexpo_pipeline_en.md new file mode 100644 index 00000000000000..aa41e00968b990 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-translation_model_zxdexpo_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English translation_model_zxdexpo_pipeline pipeline T5Transformer from zxdexpo +author: John Snow Labs +name: translation_model_zxdexpo_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`translation_model_zxdexpo_pipeline` is a English model originally trained by zxdexpo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/translation_model_zxdexpo_pipeline_en_5.5.1_3.0_1734331656663.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/translation_model_zxdexpo_pipeline_en_5.5.1_3.0_1734331656663.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("translation_model_zxdexpo_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("translation_model_zxdexpo_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|translation_model_zxdexpo_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|336.3 MB| + +## References + +https://huggingface.co/zxdexpo/translation_model + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-trocr_base_handwritten_swe_pipeline_sv.md b/docs/_posts/ahmedlone127/2024-12-16-trocr_base_handwritten_swe_pipeline_sv.md new file mode 100644 index 00000000000000..265b67af2a0371 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-trocr_base_handwritten_swe_pipeline_sv.md @@ -0,0 +1,69 @@ +--- +layout: model +title: Swedish trocr_base_handwritten_swe_pipeline pipeline VisionEncoderDecoderForImageCaptioning from Riksarkivet +author: John Snow Labs +name: trocr_base_handwritten_swe_pipeline +date: 2024-12-16 +tags: [sv, open_source, pipeline, onnx] +task: Image Captioning +language: sv +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained VisionEncoderDecoderForImageCaptioning, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`trocr_base_handwritten_swe_pipeline` is a Swedish model originally trained by Riksarkivet. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/trocr_base_handwritten_swe_pipeline_sv_5.5.1_3.0_1734317629462.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/trocr_base_handwritten_swe_pipeline_sv_5.5.1_3.0_1734317629462.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("trocr_base_handwritten_swe_pipeline", lang = "sv") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("trocr_base_handwritten_swe_pipeline", lang = "sv") +val annotations = pipeline.transform(df) + +``` +
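+
+For this image-captioning pipeline, `df` must be a DataFrame of images rather than text. A minimal sketch of loading one with Spark's image reader (the session setup and folder path are illustrative assumptions):
+
+```python
+import sparknlp
+
+spark = sparknlp.start()
+
+# Read a folder of scanned handwriting images into the "image" column used by ImageAssembler
+df = spark.read \
+    .format("image") \
+    .option("dropInvalid", True) \
+    .load("path/to/handwritten_images")
+```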
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|trocr_base_handwritten_swe_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|sv| +|Size:|1.4 GB| + +## References + +https://huggingface.co/Riksarkivet/trocr-base-handwritten-swe + +## Included Models + +- ImageAssembler +- VisionEncoderDecoderForImageCaptioning \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-trocr_base_spanish_pipeline_es.md b/docs/_posts/ahmedlone127/2024-12-16-trocr_base_spanish_pipeline_es.md new file mode 100644 index 00000000000000..dd256b076f4eaf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-trocr_base_spanish_pipeline_es.md @@ -0,0 +1,69 @@ +--- +layout: model +title: Castilian, Spanish trocr_base_spanish_pipeline pipeline VisionEncoderDecoderForImageCaptioning from qantev +author: John Snow Labs +name: trocr_base_spanish_pipeline +date: 2024-12-16 +tags: [es, open_source, pipeline, onnx] +task: Image Captioning +language: es +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained VisionEncoderDecoderForImageCaptioning, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`trocr_base_spanish_pipeline` is a Castilian, Spanish model originally trained by qantev. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/trocr_base_spanish_pipeline_es_5.5.1_3.0_1734319179816.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/trocr_base_spanish_pipeline_es_5.5.1_3.0_1734319179816.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("trocr_base_spanish_pipeline", lang = "es") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("trocr_base_spanish_pipeline", lang = "es") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|trocr_base_spanish_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|es| +|Size:|1.4 GB| + +## References + +https://huggingface.co/qantev/trocr-base-spanish + +## Included Models + +- ImageAssembler +- VisionEncoderDecoderForImageCaptioning \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-trocr_medieval_base_pipeline_la.md b/docs/_posts/ahmedlone127/2024-12-16-trocr_medieval_base_pipeline_la.md new file mode 100644 index 00000000000000..b3786e3dc47468 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-trocr_medieval_base_pipeline_la.md @@ -0,0 +1,69 @@ +--- +layout: model +title: Latin trocr_medieval_base_pipeline pipeline VisionEncoderDecoderForImageCaptioning from medieval-data +author: John Snow Labs +name: trocr_medieval_base_pipeline +date: 2024-12-16 +tags: [la, open_source, pipeline, onnx] +task: Image Captioning +language: la +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained VisionEncoderDecoderForImageCaptioning, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`trocr_medieval_base_pipeline` is a Latin model originally trained by medieval-data. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/trocr_medieval_base_pipeline_la_5.5.1_3.0_1734317143136.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/trocr_medieval_base_pipeline_la_5.5.1_3.0_1734317143136.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("trocr_medieval_base_pipeline", lang = "la") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("trocr_medieval_base_pipeline", lang = "la") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|trocr_medieval_base_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|la| +|Size:|1.4 GB| + +## References + +https://huggingface.co/medieval-data/trocr-medieval-base + +## Included Models + +- ImageAssembler +- VisionEncoderDecoderForImageCaptioning \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-trocr_medieval_latin_caroline_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-trocr_medieval_latin_caroline_pipeline_en.md new file mode 100644 index 00000000000000..3d7879ff029f74 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-trocr_medieval_latin_caroline_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English trocr_medieval_latin_caroline_pipeline pipeline VisionEncoderDecoderForImageCaptioning from medieval-data +author: John Snow Labs +name: trocr_medieval_latin_caroline_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Captioning +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained VisionEncoderDecoderForImageCaptioning, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`trocr_medieval_latin_caroline_pipeline` is a English model originally trained by medieval-data. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/trocr_medieval_latin_caroline_pipeline_en_5.5.1_3.0_1734318245566.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/trocr_medieval_latin_caroline_pipeline_en_5.5.1_3.0_1734318245566.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("trocr_medieval_latin_caroline_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("trocr_medieval_latin_caroline_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|trocr_medieval_latin_caroline_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.4 GB| + +## References + +https://huggingface.co/medieval-data/trocr-medieval-latin-caroline + +## Included Models + +- ImageAssembler +- VisionEncoderDecoderForImageCaptioning \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-v1_en.md b/docs/_posts/ahmedlone127/2024-12-16-v1_en.md new file mode 100644 index 00000000000000..5d5fc2db2f5632 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-v1_en.md @@ -0,0 +1,86 @@ +--- +layout: model +title: English v1 T5Transformer from dmen24 +author: John Snow Labs +name: v1 +date: 2024-12-16 +tags: [en, open_source, onnx, t5, question_answering, summarization, translation, text_generation] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`v1` is a English model originally trained by dmen24. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/v1_en_5.5.1_3.0_1734331113954.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/v1_en_5.5.1_3.0_1734331113954.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+  .setInputCol('text') \
+  .setOutputCol('document')
+
+t5 = T5Transformer.pretrained("v1","en") \
+  .setInputCols(["document"]) \
+  .setOutputCol("output")
+
+pipeline = Pipeline().setStages([documentAssembler, t5])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val t5 = T5Transformer.pretrained("v1", "en")
+  .setInputCols(Array("document"))
+  .setOutputCol("output")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|v1| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|330.5 MB| + +## References + +https://huggingface.co/dmen24/V1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-v1_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-v1_pipeline_en.md new file mode 100644 index 00000000000000..243c0d67f242f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-v1_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English v1_pipeline pipeline T5Transformer from dmen24 +author: John Snow Labs +name: v1_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: [Question Answering, Summarization, Translation, Text Generation] +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained T5Transformer, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`v1_pipeline` is a English model originally trained by dmen24. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/v1_pipeline_en_5.5.1_3.0_1734331134471.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/v1_pipeline_en_5.5.1_3.0_1734331134471.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("v1_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("v1_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|v1_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|330.5 MB| + +## References + +https://huggingface.co/dmen24/V1 + +## Included Models + +- DocumentAssembler +- T5Transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-vit_gpt2_image_captioning_baseplate_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-vit_gpt2_image_captioning_baseplate_pipeline_en.md new file mode 100644 index 00000000000000..574bcd1c91e855 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-vit_gpt2_image_captioning_baseplate_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English vit_gpt2_image_captioning_baseplate_pipeline pipeline VisionEncoderDecoderForImageCaptioning from baseplate +author: John Snow Labs +name: vit_gpt2_image_captioning_baseplate_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Captioning +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained VisionEncoderDecoderForImageCaptioning, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`vit_gpt2_image_captioning_baseplate_pipeline` is a English model originally trained by baseplate. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/vit_gpt2_image_captioning_baseplate_pipeline_en_5.5.1_3.0_1734317002973.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/vit_gpt2_image_captioning_baseplate_pipeline_en_5.5.1_3.0_1734317002973.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("vit_gpt2_image_captioning_baseplate_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("vit_gpt2_image_captioning_baseplate_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|vit_gpt2_image_captioning_baseplate_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/baseplate/vit-gpt2-image-captioning + +## Included Models + +- ImageAssembler +- VisionEncoderDecoderForImageCaptioning \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-vit_gpt2_stablediffusion2_lora_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-vit_gpt2_stablediffusion2_lora_pipeline_en.md new file mode 100644 index 00000000000000..8a528bc332339f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-vit_gpt2_stablediffusion2_lora_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English vit_gpt2_stablediffusion2_lora_pipeline pipeline VisionEncoderDecoderForImageCaptioning from nttdataspain +author: John Snow Labs +name: vit_gpt2_stablediffusion2_lora_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Captioning +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained VisionEncoderDecoderForImageCaptioning, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`vit_gpt2_stablediffusion2_lora_pipeline` is a English model originally trained by nttdataspain. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/vit_gpt2_stablediffusion2_lora_pipeline_en_5.5.1_3.0_1734317368837.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/vit_gpt2_stablediffusion2_lora_pipeline_en_5.5.1_3.0_1734317368837.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("vit_gpt2_stablediffusion2_lora_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("vit_gpt2_stablediffusion2_lora_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|vit_gpt2_stablediffusion2_lora_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/nttdataspain/vit-gpt2-stablediffusion2-lora + +## Included Models + +- ImageAssembler +- VisionEncoderDecoderForImageCaptioning \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-vit_gpt2_verifycode_caption_airis_channel_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-vit_gpt2_verifycode_caption_airis_channel_pipeline_en.md new file mode 100644 index 00000000000000..558ca75e07fcaf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-vit_gpt2_verifycode_caption_airis_channel_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English vit_gpt2_verifycode_caption_airis_channel_pipeline pipeline VisionEncoderDecoderForImageCaptioning from AIris-Channel +author: John Snow Labs +name: vit_gpt2_verifycode_caption_airis_channel_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Captioning +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained VisionEncoderDecoderForImageCaptioning, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`vit_gpt2_verifycode_caption_airis_channel_pipeline` is a English model originally trained by AIris-Channel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/vit_gpt2_verifycode_caption_airis_channel_pipeline_en_5.5.1_3.0_1734319245533.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/vit_gpt2_verifycode_caption_airis_channel_pipeline_en_5.5.1_3.0_1734319245533.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("vit_gpt2_verifycode_caption_airis_channel_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("vit_gpt2_verifycode_caption_airis_channel_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|vit_gpt2_verifycode_caption_airis_channel_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/AIris-Channel/vit-gpt2-verifycode-caption + +## Included Models + +- ImageAssembler +- VisionEncoderDecoderForImageCaptioning \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-vit_swin_base_224_gpt2_image_captioning_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-vit_swin_base_224_gpt2_image_captioning_pipeline_en.md new file mode 100644 index 00000000000000..f29f3a1db87eb0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-vit_swin_base_224_gpt2_image_captioning_pipeline_en.md @@ -0,0 +1,69 @@ +--- +layout: model +title: English vit_swin_base_224_gpt2_image_captioning_pipeline pipeline VisionEncoderDecoderForImageCaptioning from Abdou +author: John Snow Labs +name: vit_swin_base_224_gpt2_image_captioning_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Image Captioning +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained VisionEncoderDecoderForImageCaptioning, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`vit_swin_base_224_gpt2_image_captioning_pipeline` is a English model originally trained by Abdou. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/vit_swin_base_224_gpt2_image_captioning_pipeline_en_5.5.1_3.0_1734317712106.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/vit_swin_base_224_gpt2_image_captioning_pipeline_en_5.5.1_3.0_1734317712106.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("vit_swin_base_224_gpt2_image_captioning_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("vit_swin_base_224_gpt2_image_captioning_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|vit_swin_base_224_gpt2_image_captioning_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|1.0 GB| + +## References + +https://huggingface.co/Abdou/vit-swin-base-224-gpt2-image-captioning + +## Included Models + +- ImageAssembler +- VisionEncoderDecoderForImageCaptioning \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_ner_alexbeta80_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_ner_alexbeta80_en.md new file mode 100644 index 00000000000000..d55538b0b49af3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_ner_alexbeta80_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_ner_alexbeta80 XlmRoBertaForTokenClassification from alexbeta80 +author: John Snow Labs +name: xlm_roberta_base_finetuned_ner_alexbeta80 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_ner_alexbeta80` is a English model originally trained by alexbeta80. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_ner_alexbeta80_en_5.5.1_3.0_1734322409766.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_ner_alexbeta80_en_5.5.1_3.0_1734322409766.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, XlmRoBertaForTokenClassification
+from pyspark.ml import Pipeline
+
+spark = sparknlp.start()
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+# the classifier consumes the 'document' and 'token' columns produced above
+tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_ner_alexbeta80","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_ner_alexbeta80", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
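+
+Once the pipeline above has been run, the `ner` column holds one annotation per token. A quick way to eyeball the predictions (shown here as an illustrative addition, not part of the original card) is to select the token and tag arrays side by side:
+
+```python
+# tokens and their predicted IOB tags, one array per input row
+pipelineDF.select("token.result", "ner.result").show(truncate=False)
+```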
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_ner_alexbeta80| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|806.7 MB| + +## References + +https://huggingface.co/alexbeta80/xlm-roberta-base-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_ner_alexbeta80_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_ner_alexbeta80_pipeline_en.md new file mode 100644 index 00000000000000..f145f662b81545 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_ner_alexbeta80_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_ner_alexbeta80_pipeline pipeline XlmRoBertaForTokenClassification from alexbeta80 +author: John Snow Labs +name: xlm_roberta_base_finetuned_ner_alexbeta80_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_ner_alexbeta80_pipeline` is a English model originally trained by alexbeta80. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_ner_alexbeta80_pipeline_en_5.5.1_3.0_1734322523891.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_ner_alexbeta80_pipeline_en_5.5.1_3.0_1734322523891.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_ner_alexbeta80_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_ner_alexbeta80_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
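+
+The snippet above assumes an existing DataFrame `df` with a `text` column. For a self-contained run, a minimal sketch looks like the following; the sample sentence is only an illustration, and the `ner` key assumes the pipeline exposes its token-classification output under that column name:
+
+```python
+import sparknlp
+from sparknlp.pretrained import PretrainedPipeline
+
+spark = sparknlp.start()
+
+pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_ner_alexbeta80_pipeline", lang="en")
+
+# annotate() accepts a plain string and returns a dict of output columns
+result = pipeline.annotate("My name is John and I work at John Snow Labs in London.")
+print(result.get("ner"))
+```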
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_ner_alexbeta80_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|806.7 MB| + +## References + +https://huggingface.co/alexbeta80/xlm-roberta-base-finetuned-ner + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_all_ashkanero_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_all_ashkanero_en.md new file mode 100644 index 00000000000000..7833d2004f5e2b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_all_ashkanero_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_all_ashkanero XlmRoBertaForTokenClassification from Ashkanero +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_all_ashkanero +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_all_ashkanero` is a English model originally trained by Ashkanero. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_all_ashkanero_en_5.5.1_3.0_1734322306845.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_all_ashkanero_en_5.5.1_3.0_1734322306845.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, XlmRoBertaForTokenClassification
+from pyspark.ml import Pipeline
+
+spark = sparknlp.start()
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+# the classifier consumes the 'document' and 'token' columns produced above
+tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_all_ashkanero","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_all_ashkanero", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_all_ashkanero| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|848.0 MB| + +## References + +https://huggingface.co/Ashkanero/xlm-roberta-base-finetuned-panx-all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_all_ashkanero_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_all_ashkanero_pipeline_en.md new file mode 100644 index 00000000000000..21d4e02f266cee --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_all_ashkanero_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_all_ashkanero_pipeline pipeline XlmRoBertaForTokenClassification from Ashkanero +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_all_ashkanero_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_all_ashkanero_pipeline` is a English model originally trained by Ashkanero. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_all_ashkanero_pipeline_en_5.5.1_3.0_1734322388314.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_all_ashkanero_pipeline_en_5.5.1_3.0_1734322388314.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_all_ashkanero_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_all_ashkanero_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_all_ashkanero_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|848.0 MB| + +## References + +https://huggingface.co/Ashkanero/xlm-roberta-base-finetuned-panx-all + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_all_francois2511_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_all_francois2511_en.md new file mode 100644 index 00000000000000..a3cb809768da5d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_all_francois2511_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_all_francois2511 XlmRoBertaForTokenClassification from Francois2511 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_all_francois2511 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_all_francois2511` is a English model originally trained by Francois2511. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_all_francois2511_en_5.5.1_3.0_1734323881210.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_all_francois2511_en_5.5.1_3.0_1734323881210.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, XlmRoBertaForTokenClassification
+from pyspark.ml import Pipeline
+
+spark = sparknlp.start()
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+# the classifier consumes the 'document' and 'token' columns produced above
+tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_all_francois2511","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_all_francois2511", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_all_francois2511| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|848.0 MB| + +## References + +https://huggingface.co/Francois2511/xlm-roberta-base-finetuned-panx-all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_all_francois2511_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_all_francois2511_pipeline_en.md new file mode 100644 index 00000000000000..28782e8fab1bfd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_all_francois2511_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_all_francois2511_pipeline pipeline XlmRoBertaForTokenClassification from Francois2511 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_all_francois2511_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_all_francois2511_pipeline` is a English model originally trained by Francois2511. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_all_francois2511_pipeline_en_5.5.1_3.0_1734323962543.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_all_francois2511_pipeline_en_5.5.1_3.0_1734323962543.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_all_francois2511_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_all_francois2511_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_all_francois2511_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|848.0 MB| + +## References + +https://huggingface.co/Francois2511/xlm-roberta-base-finetuned-panx-all + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_all_wndlek3_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_all_wndlek3_en.md new file mode 100644 index 00000000000000..28e518042ff479 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_all_wndlek3_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_all_wndlek3 XlmRoBertaForTokenClassification from wndlek3 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_all_wndlek3 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_all_wndlek3` is a English model originally trained by wndlek3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_all_wndlek3_en_5.5.1_3.0_1734322667637.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_all_wndlek3_en_5.5.1_3.0_1734322667637.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, XlmRoBertaForTokenClassification
+from pyspark.ml import Pipeline
+
+spark = sparknlp.start()
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+# the classifier consumes the 'document' and 'token' columns produced above
+tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_all_wndlek3","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_all_wndlek3", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
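+
+If you plan to reuse the fitted pipeline from the example above, it can be persisted with standard Spark ML mechanics so the model is not downloaded again on every run. The save path below is only a placeholder:
+
+```python
+from pyspark.ml import PipelineModel
+
+# write the fitted pipeline to disk, then reload it for inference
+pipelineModel.write().overwrite().save("xlm_roberta_panx_all_ner_pipeline")
+reloaded = PipelineModel.load("xlm_roberta_panx_all_ner_pipeline")
+reloaded.transform(data).select("ner.result").show(truncate=False)
+```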
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_all_wndlek3| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|848.0 MB| + +## References + +https://huggingface.co/wndlek3/xlm-roberta-base-finetuned-panx-all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_all_wndlek3_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_all_wndlek3_pipeline_en.md new file mode 100644 index 00000000000000..8bfdddb5906e7a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_all_wndlek3_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_all_wndlek3_pipeline pipeline XlmRoBertaForTokenClassification from wndlek3 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_all_wndlek3_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_all_wndlek3_pipeline` is a English model originally trained by wndlek3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_all_wndlek3_pipeline_en_5.5.1_3.0_1734322749913.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_all_wndlek3_pipeline_en_5.5.1_3.0_1734322749913.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_all_wndlek3_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_all_wndlek3_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_all_wndlek3_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|848.0 MB| + +## References + +https://huggingface.co/wndlek3/xlm-roberta-base-finetuned-panx-all + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_english_krish2218_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_english_krish2218_en.md new file mode 100644 index 00000000000000..ad90ca5a5b05d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_english_krish2218_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_english_krish2218 XlmRoBertaForTokenClassification from Krish2218 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_english_krish2218 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_english_krish2218` is a English model originally trained by Krish2218. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_english_krish2218_en_5.5.1_3.0_1734324119252.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_english_krish2218_en_5.5.1_3.0_1734324119252.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, XlmRoBertaForTokenClassification
+from pyspark.ml import Pipeline
+
+spark = sparknlp.start()
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+# the classifier consumes the 'document' and 'token' columns produced above
+tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_english_krish2218","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_english_krish2218", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_english_krish2218| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|814.3 MB| + +## References + +https://huggingface.co/Krish2218/xlm-roberta-base-finetuned-panx-en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_english_krish2218_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_english_krish2218_pipeline_en.md new file mode 100644 index 00000000000000..32d30bd697eb76 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_english_krish2218_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_english_krish2218_pipeline pipeline XlmRoBertaForTokenClassification from Krish2218 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_english_krish2218_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_english_krish2218_pipeline` is a English model originally trained by Krish2218. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_english_krish2218_pipeline_en_5.5.1_3.0_1734324229922.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_english_krish2218_pipeline_en_5.5.1_3.0_1734324229922.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_english_krish2218_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_english_krish2218_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_english_krish2218_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|814.3 MB| + +## References + +https://huggingface.co/Krish2218/xlm-roberta-base-finetuned-panx-en + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_english_wndlek3_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_english_wndlek3_en.md new file mode 100644 index 00000000000000..b85c0ffa185bf8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_english_wndlek3_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_english_wndlek3 XlmRoBertaForTokenClassification from wndlek3 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_english_wndlek3 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_english_wndlek3` is a English model originally trained by wndlek3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_english_wndlek3_en_5.5.1_3.0_1734323563994.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_english_wndlek3_en_5.5.1_3.0_1734323563994.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, XlmRoBertaForTokenClassification
+from pyspark.ml import Pipeline
+
+spark = sparknlp.start()
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+# the classifier consumes the 'document' and 'token' columns produced above
+tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_english_wndlek3","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_english_wndlek3", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_english_wndlek3| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|814.3 MB| + +## References + +https://huggingface.co/wndlek3/xlm-roberta-base-finetuned-panx-en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_english_wndlek3_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_english_wndlek3_pipeline_en.md new file mode 100644 index 00000000000000..e0c171fae48c2c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_english_wndlek3_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_english_wndlek3_pipeline pipeline XlmRoBertaForTokenClassification from wndlek3 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_english_wndlek3_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_english_wndlek3_pipeline` is a English model originally trained by wndlek3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_english_wndlek3_pipeline_en_5.5.1_3.0_1734323671371.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_english_wndlek3_pipeline_en_5.5.1_3.0_1734323671371.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_english_wndlek3_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_english_wndlek3_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_english_wndlek3_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|814.3 MB| + +## References + +https://huggingface.co/wndlek3/xlm-roberta-base-finetuned-panx-en + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_dasooo_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_dasooo_en.md new file mode 100644 index 00000000000000..ebcde797051557 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_dasooo_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_french_dasooo XlmRoBertaForTokenClassification from daSooo +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_french_dasooo +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_french_dasooo` is a English model originally trained by daSooo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_dasooo_en_5.5.1_3.0_1734321633997.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_dasooo_en_5.5.1_3.0_1734321633997.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, XlmRoBertaForTokenClassification
+from pyspark.ml import Pipeline
+
+spark = sparknlp.start()
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+# the classifier consumes the 'document' and 'token' columns produced above
+tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_french_dasooo","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_french_dasooo", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
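+
+For single sentences or small batches, a LightPipeline built from the fitted pipeline above avoids the overhead of a full Spark job. The French sentence is only a sample input:
+
+```python
+from sparknlp.base import LightPipeline
+
+light = LightPipeline(pipelineModel)
+# fullAnnotate keeps begin/end offsets and metadata for every annotation
+annotations = light.fullAnnotate("Emmanuel Macron a rencontré Angela Merkel à Berlin.")
+print(annotations[0]["ner"])
+```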
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_french_dasooo| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|827.9 MB| + +## References + +https://huggingface.co/daSooo/xlm-roberta-base-finetuned-panx-fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_dasooo_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_dasooo_pipeline_en.md new file mode 100644 index 00000000000000..9924b50a66e160 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_dasooo_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_french_dasooo_pipeline pipeline XlmRoBertaForTokenClassification from daSooo +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_french_dasooo_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_french_dasooo_pipeline` is a English model originally trained by daSooo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_dasooo_pipeline_en_5.5.1_3.0_1734321729885.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_dasooo_pipeline_en_5.5.1_3.0_1734321729885.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_french_dasooo_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_french_dasooo_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_french_dasooo_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|827.9 MB| + +## References + +https://huggingface.co/daSooo/xlm-roberta-base-finetuned-panx-fr + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_leotunganh_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_leotunganh_en.md new file mode 100644 index 00000000000000..075c36177a9ae6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_leotunganh_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_french_leotunganh XlmRoBertaForTokenClassification from LeoTungAnh +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_french_leotunganh +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_french_leotunganh` is a English model originally trained by LeoTungAnh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_leotunganh_en_5.5.1_3.0_1734322555915.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_leotunganh_en_5.5.1_3.0_1734322555915.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, XlmRoBertaForTokenClassification
+from pyspark.ml import Pipeline
+
+spark = sparknlp.start()
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+# the classifier consumes the 'document' and 'token' columns produced above
+tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_french_leotunganh","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_french_leotunganh", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_french_leotunganh| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|821.4 MB| + +## References + +https://huggingface.co/LeoTungAnh/xlm-roberta-base-finetuned-panx-fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_leotunganh_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_leotunganh_pipeline_en.md new file mode 100644 index 00000000000000..402fc585c3e23f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_leotunganh_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_french_leotunganh_pipeline pipeline XlmRoBertaForTokenClassification from LeoTungAnh +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_french_leotunganh_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_french_leotunganh_pipeline` is a English model originally trained by LeoTungAnh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_leotunganh_pipeline_en_5.5.1_3.0_1734322658853.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_leotunganh_pipeline_en_5.5.1_3.0_1734322658853.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_french_leotunganh_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_french_leotunganh_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_french_leotunganh_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|821.4 MB| + +## References + +https://huggingface.co/LeoTungAnh/xlm-roberta-base-finetuned-panx-fr + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_lsh231_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_lsh231_en.md new file mode 100644 index 00000000000000..d8b1fcf4e869c6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_lsh231_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_french_lsh231 XlmRoBertaForTokenClassification from lsh231 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_french_lsh231 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_french_lsh231` is a English model originally trained by lsh231. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_lsh231_en_5.5.1_3.0_1734321960006.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_lsh231_en_5.5.1_3.0_1734321960006.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, XlmRoBertaForTokenClassification
+from pyspark.ml import Pipeline
+
+spark = sparknlp.start()
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+# the classifier consumes the 'document' and 'token' columns produced above
+tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_french_lsh231","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_french_lsh231", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_french_lsh231| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|827.9 MB| + +## References + +https://huggingface.co/lsh231/xlm-roberta-base-finetuned-panx-fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_lsh231_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_lsh231_pipeline_en.md new file mode 100644 index 00000000000000..9cc20e78b495d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_lsh231_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_french_lsh231_pipeline pipeline XlmRoBertaForTokenClassification from lsh231 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_french_lsh231_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_french_lsh231_pipeline` is a English model originally trained by lsh231. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_lsh231_pipeline_en_5.5.1_3.0_1734322050771.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_lsh231_pipeline_en_5.5.1_3.0_1734322050771.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_french_lsh231_pipeline", lang = "en") +annotations = pipeline.transform(df) + +``` +```scala + +val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_french_lsh231_pipeline", lang = "en") +val annotations = pipeline.transform(df) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_french_lsh231_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|827.9 MB| + +## References + +https://huggingface.co/lsh231/xlm-roberta-base-finetuned-panx-fr + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_mrwetsnow_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_mrwetsnow_en.md new file mode 100644 index 00000000000000..87d99c90be3834 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_mrwetsnow_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_french_mrwetsnow XlmRoBertaForTokenClassification from MrWetsnow +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_french_mrwetsnow +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_french_mrwetsnow` is a English model originally trained by MrWetsnow. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_mrwetsnow_en_5.5.1_3.0_1734323691108.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_mrwetsnow_en_5.5.1_3.0_1734323691108.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, XlmRoBertaForTokenClassification
+from pyspark.ml import Pipeline
+
+spark = sparknlp.start()
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+# the classifier consumes the 'document' and 'token' columns produced above
+tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_french_mrwetsnow","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_french_mrwetsnow", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
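+
+The pipeline above emits one IOB tag per token. If you prefer whole entity spans (for example a person name as a single chunk), a NerConverter stage can be appended; this is an optional extension, not part of the original card:
+
+```python
+from sparknlp.annotator import NerConverter
+
+# merge B-/I- token tags into contiguous entity chunks
+converter = NerConverter() \
+    .setInputCols(["document", "token", "ner"]) \
+    .setOutputCol("entities")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier, converter])
+```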
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_french_mrwetsnow| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|827.9 MB| + +## References + +https://huggingface.co/MrWetsnow/xlm-roberta-base-finetuned-panx-fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_mrwetsnow_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_mrwetsnow_pipeline_en.md new file mode 100644 index 00000000000000..e0edbe685f5154 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_mrwetsnow_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_french_mrwetsnow_pipeline pipeline XlmRoBertaForTokenClassification from MrWetsnow +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_french_mrwetsnow_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_french_mrwetsnow_pipeline` is a English model originally trained by MrWetsnow. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_mrwetsnow_pipeline_en_5.5.1_3.0_1734323784371.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_mrwetsnow_pipeline_en_5.5.1_3.0_1734323784371.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_french_mrwetsnow_pipeline", lang = "en")
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_french_mrwetsnow_pipeline", lang = "en")
val df = Seq("I love spark-nlp").toDS.toDF("text")
val annotations = pipeline.transform(df)

```
</div>
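
The same pipeline can also be queried on plain strings through `PretrainedPipeline.annotate`, which avoids building a DataFrame for quick checks. A minimal Python sketch; the `token` and `ner` result keys are assumed from the stages listed under "Included Models" below, and the sample sentence is only illustrative:

```python
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_french_mrwetsnow_pipeline", lang="en")

# annotate() returns a dict keyed by the pipeline's output columns
result = pipeline.annotate("John Snow Labs est basée à Wilmington, Delaware.")
print(result["token"])
print(result["ner"])
```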
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_french_mrwetsnow_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|827.9 MB| + +## References + +https://huggingface.co/MrWetsnow/xlm-roberta-base-finetuned-panx-fr + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_ultimecia_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_ultimecia_en.md new file mode 100644 index 00000000000000..e04635634ccd8a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_ultimecia_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_french_ultimecia XlmRoBertaForTokenClassification from ultimecia +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_french_ultimecia +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_french_ultimecia` is a English model originally trained by ultimecia. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_ultimecia_en_5.5.1_3.0_1734324104973.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_ultimecia_en_5.5.1_3.0_1734324104973.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_french_ultimecia","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_french_ultimecia", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_french_ultimecia| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|821.4 MB| + +## References + +https://huggingface.co/ultimecia/xlm-roberta-base-finetuned-panx-fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_ultimecia_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_ultimecia_pipeline_en.md new file mode 100644 index 00000000000000..abb52f2bc6ef0a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_ultimecia_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_french_ultimecia_pipeline pipeline XlmRoBertaForTokenClassification from ultimecia +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_french_ultimecia_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_french_ultimecia_pipeline` is a English model originally trained by ultimecia. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_ultimecia_pipeline_en_5.5.1_3.0_1734324208495.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_ultimecia_pipeline_en_5.5.1_3.0_1734324208495.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_french_ultimecia_pipeline", lang = "en")
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_french_ultimecia_pipeline", lang = "en")
val df = Seq("I love spark-nlp").toDS.toDF("text")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_french_ultimecia_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|821.4 MB| + +## References + +https://huggingface.co/ultimecia/xlm-roberta-base-finetuned-panx-fr + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_wndlek3_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_wndlek3_en.md new file mode 100644 index 00000000000000..d33a4f480ce82d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_wndlek3_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_french_wndlek3 XlmRoBertaForTokenClassification from wndlek3 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_french_wndlek3 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_french_wndlek3` is a English model originally trained by wndlek3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_wndlek3_en_5.5.1_3.0_1734322151170.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_wndlek3_en_5.5.1_3.0_1734322151170.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_french_wndlek3","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_french_wndlek3", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_french_wndlek3| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|827.9 MB| + +## References + +https://huggingface.co/wndlek3/xlm-roberta-base-finetuned-panx-fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_wndlek3_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_wndlek3_pipeline_en.md new file mode 100644 index 00000000000000..f066f97f0f8a48 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_french_wndlek3_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_french_wndlek3_pipeline pipeline XlmRoBertaForTokenClassification from wndlek3 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_french_wndlek3_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_french_wndlek3_pipeline` is a English model originally trained by wndlek3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_wndlek3_pipeline_en_5.5.1_3.0_1734322240869.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_french_wndlek3_pipeline_en_5.5.1_3.0_1734322240869.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_french_wndlek3_pipeline", lang = "en")
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_french_wndlek3_pipeline", lang = "en")
val df = Seq("I love spark-nlp").toDS.toDF("text")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_french_wndlek3_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|827.9 MB| + +## References + +https://huggingface.co/wndlek3/xlm-roberta-base-finetuned-panx-fr + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_1mind_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_1mind_en.md new file mode 100644 index 00000000000000..8975515df9e5d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_1mind_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_1mind XlmRoBertaForTokenClassification from 1mind +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_1mind +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_1mind` is a English model originally trained by 1mind. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_1mind_en_5.5.1_3.0_1734324491646.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_1mind_en_5.5.1_3.0_1734324491646.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_1mind","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_1mind", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
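
Once fitted, the pipeline above can also be wrapped in a `LightPipeline` for fast, driver-side inference on individual strings rather than DataFrames. A minimal sketch continuing from the Python snippet above; the German sample sentence is only illustrative:

```python
from sparknlp.base import LightPipeline

# LightPipeline runs the fitted stages locally, which is convenient for small inputs
light = LightPipeline(pipelineModel)
annotations = light.annotate("Angela Merkel wurde in Hamburg geboren.")
print(annotations["ner"])
```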
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_1mind| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|841.2 MB| + +## References + +https://huggingface.co/1mind/xlm-roberta-base-finetuned-panx-de \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_1mind_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_1mind_pipeline_en.md new file mode 100644 index 00000000000000..ad50c870802d60 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_1mind_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_1mind_pipeline pipeline XlmRoBertaForTokenClassification from 1mind +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_1mind_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_1mind_pipeline` is a English model originally trained by 1mind. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_1mind_pipeline_en_5.5.1_3.0_1734324574243.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_1mind_pipeline_en_5.5.1_3.0_1734324574243.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_1mind_pipeline", lang = "en")
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_1mind_pipeline", lang = "en")
val df = Seq("I love spark-nlp").toDS.toDF("text")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_1mind_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|841.2 MB| + +## References + +https://huggingface.co/1mind/xlm-roberta-base-finetuned-panx-de + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_french_penguinman73_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_french_penguinman73_en.md new file mode 100644 index 00000000000000..042b49e80e9292 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_french_penguinman73_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_french_penguinman73 XlmRoBertaForTokenClassification from penguinman73 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_french_penguinman73 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_french_penguinman73` is a English model originally trained by penguinman73. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_french_penguinman73_en_5.5.1_3.0_1734323508488.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_french_penguinman73_en_5.5.1_3.0_1734323508488.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_french_penguinman73","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_french_penguinman73", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_french_penguinman73| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|843.4 MB| + +## References + +https://huggingface.co/penguinman73/xlm-roberta-base-finetuned-panx-de-fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_french_penguinman73_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_french_penguinman73_pipeline_en.md new file mode 100644 index 00000000000000..36c8c7ea72cef5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_french_penguinman73_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_french_penguinman73_pipeline pipeline XlmRoBertaForTokenClassification from penguinman73 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_french_penguinman73_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_french_penguinman73_pipeline` is a English model originally trained by penguinman73. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_french_penguinman73_pipeline_en_5.5.1_3.0_1734323594632.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_french_penguinman73_pipeline_en_5.5.1_3.0_1734323594632.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_french_penguinman73_pipeline", lang = "en")
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_french_penguinman73_pipeline", lang = "en")
val df = Seq("I love spark-nlp").toDS.toDF("text")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_french_penguinman73_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|843.4 MB| + +## References + +https://huggingface.co/penguinman73/xlm-roberta-base-finetuned-panx-de-fr + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_french_scionk_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_french_scionk_en.md new file mode 100644 index 00000000000000..30c7f3428d94a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_french_scionk_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_french_scionk XlmRoBertaForTokenClassification from scionk +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_french_scionk +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_french_scionk` is a English model originally trained by scionk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_french_scionk_en_5.5.1_3.0_1734323443129.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_french_scionk_en_5.5.1_3.0_1734323443129.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_french_scionk","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_french_scionk", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_french_scionk| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|843.4 MB| + +## References + +https://huggingface.co/scionk/xlm-roberta-base-finetuned-panx-de-fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_french_scionk_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_french_scionk_pipeline_en.md new file mode 100644 index 00000000000000..bb8371b8ca1768 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_french_scionk_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_french_scionk_pipeline pipeline XlmRoBertaForTokenClassification from scionk +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_french_scionk_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_french_scionk_pipeline` is a English model originally trained by scionk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_french_scionk_pipeline_en_5.5.1_3.0_1734323528798.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_french_scionk_pipeline_en_5.5.1_3.0_1734323528798.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_french_scionk_pipeline", lang = "en")
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_french_scionk_pipeline", lang = "en")
val df = Seq("I love spark-nlp").toDS.toDF("text")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_french_scionk_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|843.4 MB| + +## References + +https://huggingface.co/scionk/xlm-roberta-base-finetuned-panx-de-fr + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_goldsurfer_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_goldsurfer_en.md new file mode 100644 index 00000000000000..fc0fecdd1af6e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_goldsurfer_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_goldsurfer XlmRoBertaForTokenClassification from GoldSurfer +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_goldsurfer +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_goldsurfer` is a English model originally trained by GoldSurfer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_goldsurfer_en_5.5.1_3.0_1734321230699.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_goldsurfer_en_5.5.1_3.0_1734321230699.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_goldsurfer","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_goldsurfer", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_goldsurfer| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|853.8 MB| + +## References + +https://huggingface.co/GoldSurfer/xlm-roberta-base-finetuned-panx-de \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_goldsurfer_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_goldsurfer_pipeline_en.md new file mode 100644 index 00000000000000..46d33de9bb6edf --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_goldsurfer_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_goldsurfer_pipeline pipeline XlmRoBertaForTokenClassification from GoldSurfer +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_goldsurfer_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_goldsurfer_pipeline` is a English model originally trained by GoldSurfer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_goldsurfer_pipeline_en_5.5.1_3.0_1734321298462.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_goldsurfer_pipeline_en_5.5.1_3.0_1734321298462.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_goldsurfer_pipeline", lang = "en")
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_goldsurfer_pipeline", lang = "en")
val df = Seq("I love spark-nlp").toDS.toDF("text")
val annotations = pipeline.transform(df)

```
</div>
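
To inspect the predictions that `transform` adds to the DataFrame, the annotation columns can be flattened with `selectExpr`. A small sketch continuing from the Python snippet above; the `token` and `ner` column names are assumed from the stages listed under "Included Models" below:

```python
# Each annotation column is an array of structs; .result extracts the string values
annotations.selectExpr("token.result as tokens", "ner.result as ner_tags") \
    .show(truncate=False)
```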
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_goldsurfer_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|853.8 MB| + +## References + +https://huggingface.co/GoldSurfer/xlm-roberta-base-finetuned-panx-de + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_jojeyh_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_jojeyh_en.md new file mode 100644 index 00000000000000..ac5a734c4d5795 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_jojeyh_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_jojeyh XlmRoBertaForTokenClassification from jojeyh +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_jojeyh +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_jojeyh` is a English model originally trained by jojeyh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_jojeyh_en_5.5.1_3.0_1734324193130.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_jojeyh_en_5.5.1_3.0_1734324193130.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_jojeyh","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_jojeyh", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_jojeyh| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|840.8 MB| + +## References + +https://huggingface.co/jojeyh/xlm-roberta-base-finetuned-panx-de \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_jojeyh_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_jojeyh_pipeline_en.md new file mode 100644 index 00000000000000..31942d95c6d7b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_jojeyh_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_jojeyh_pipeline pipeline XlmRoBertaForTokenClassification from jojeyh +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_jojeyh_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_jojeyh_pipeline` is a English model originally trained by jojeyh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_jojeyh_pipeline_en_5.5.1_3.0_1734324276840.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_jojeyh_pipeline_en_5.5.1_3.0_1734324276840.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_jojeyh_pipeline", lang = "en")
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_jojeyh_pipeline", lang = "en")
val df = Seq("I love spark-nlp").toDS.toDF("text")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_jojeyh_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|840.8 MB| + +## References + +https://huggingface.co/jojeyh/xlm-roberta-base-finetuned-panx-de + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_junejae_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_junejae_en.md new file mode 100644 index 00000000000000..ea32b3b1d5a7ff --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_junejae_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_junejae XlmRoBertaForTokenClassification from junejae +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_junejae +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_junejae` is a English model originally trained by junejae. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_junejae_en_5.5.1_3.0_1734322022809.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_junejae_en_5.5.1_3.0_1734322022809.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_junejae","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_junejae", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_junejae| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|854.4 MB| + +## References + +https://huggingface.co/junejae/xlm-roberta-base-finetuned-panx-de \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_junejae_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_junejae_pipeline_en.md new file mode 100644 index 00000000000000..9abe8b05c9f3fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_junejae_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_junejae_pipeline pipeline XlmRoBertaForTokenClassification from junejae +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_junejae_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_junejae_pipeline` is a English model originally trained by junejae. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_junejae_pipeline_en_5.5.1_3.0_1734322086029.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_junejae_pipeline_en_5.5.1_3.0_1734322086029.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_junejae_pipeline", lang = "en")
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_junejae_pipeline", lang = "en")
val df = Seq("I love spark-nlp").toDS.toDF("text")
val annotations = pipeline.transform(df)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_junejae_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|854.4 MB| + +## References + +https://huggingface.co/junejae/xlm-roberta-base-finetuned-panx-de + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_kikim6114_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_kikim6114_en.md new file mode 100644 index 00000000000000..367c8a8a850383 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_kikim6114_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_kikim6114 XlmRoBertaForTokenClassification from kikim6114 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_kikim6114 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_kikim6114` is a English model originally trained by kikim6114. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_kikim6114_en_5.5.1_3.0_1734322848119.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_kikim6114_en_5.5.1_3.0_1734322848119.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_kikim6114","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_kikim6114", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
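
The fitted `pipelineModel` is a regular Spark ML `PipelineModel`, so it can be persisted and reloaded without refitting. A short sketch continuing from the Python snippet above; the save path is only a placeholder:

```python
from pyspark.ml import PipelineModel

# Persist the fitted pipeline and load it back later (placeholder path)
pipelineModel.write().overwrite().save("/tmp/xlm_roberta_panx_german_kikim6114_pipeline")
restored = PipelineModel.load("/tmp/xlm_roberta_panx_german_kikim6114_pipeline")
restored.transform(data).selectExpr("ner.result as ner_tags").show(truncate=False)
```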
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_kikim6114| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|840.8 MB| + +## References + +https://huggingface.co/kikim6114/xlm-roberta-base-finetuned-panx-de \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_lsh231_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_lsh231_en.md new file mode 100644 index 00000000000000..f67219edaa5638 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_lsh231_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_lsh231 XlmRoBertaForTokenClassification from lsh231 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_lsh231 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_lsh231` is a English model originally trained by lsh231. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_lsh231_en_5.5.1_3.0_1734321588064.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_lsh231_en_5.5.1_3.0_1734321588064.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_lsh231","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_lsh231", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_lsh231| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|840.8 MB| + +## References + +https://huggingface.co/lsh231/xlm-roberta-base-finetuned-panx-de \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_lsh231_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_lsh231_pipeline_en.md new file mode 100644 index 00000000000000..340b22f0998b3c --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_lsh231_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_lsh231_pipeline pipeline XlmRoBertaForTokenClassification from lsh231 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_lsh231_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_lsh231_pipeline` is a English model originally trained by lsh231. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_lsh231_pipeline_en_5.5.1_3.0_1734321672163.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_lsh231_pipeline_en_5.5.1_3.0_1734321672163.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.pretrained import PretrainedPipeline
+
+pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_lsh231_pipeline", lang = "en")
+df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+import spark.implicits._
+import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
+
+val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_lsh231_pipeline", lang = "en")
+val df = Seq("I love spark-nlp").toDF("text")
+val annotations = pipeline.transform(df)
+
+```
+</div>
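+
+For quick experiments, the downloaded pipeline can also be applied to a single string without building a DataFrame. This is a hedged sketch; the exact output keys depend on the pipeline's output column names, here assumed to be `token` and `ner`:
+
+```python
+from sparknlp.pretrained import PretrainedPipeline
+
+# Hedged sketch: annotate() runs the pipeline on one string and returns a
+# dict mapping each output column name to its list of results.
+pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_lsh231_pipeline", lang="en")
+result = pipeline.annotate("I love spark-nlp")
+print(result.get("token"))  # tokens
+print(result.get("ner"))    # predicted IOB tag per token (assumed column name)
+```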
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_lsh231_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|840.8 MB| + +## References + +https://huggingface.co/lsh231/xlm-roberta-base-finetuned-panx-de + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_lur601_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_lur601_en.md new file mode 100644 index 00000000000000..a5fc0855ffb250 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_lur601_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_lur601 XlmRoBertaForTokenClassification from lur601 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_lur601 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_lur601` is a English model originally trained by lur601. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_lur601_en_5.5.1_3.0_1734323379071.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_lur601_en_5.5.1_3.0_1734323379071.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, XlmRoBertaForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_lur601","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+import spark.implicits._
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_lur601", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
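+
+When scoring one document at a time (for example behind a service endpoint), wrapping the fitted model in a `LightPipeline` avoids launching a Spark job per request. A minimal sketch, assuming the `pipelineModel` fitted in the snippet above:
+
+```python
+from sparknlp.base import LightPipeline
+
+# Hedged sketch: LightPipeline performs driver-side inference on small inputs.
+light = LightPipeline(pipelineModel)
+print(light.annotate("I love spark-nlp"))         # dict of output column name -> results
+print(light.fullAnnotate("I love spark-nlp")[0])  # Annotation objects with begin/end offsets
+```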
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_lur601| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|652.9 MB| + +## References + +https://huggingface.co/lur601/xlm-roberta-base-finetuned-panx-de \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_lur601_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_lur601_pipeline_en.md new file mode 100644 index 00000000000000..676a7e5e3b3d08 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_lur601_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_lur601_pipeline pipeline XlmRoBertaForTokenClassification from lur601 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_lur601_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_lur601_pipeline` is a English model originally trained by lur601. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_lur601_pipeline_en_5.5.1_3.0_1734323560725.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_lur601_pipeline_en_5.5.1_3.0_1734323560725.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.pretrained import PretrainedPipeline
+
+pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_lur601_pipeline", lang = "en")
+df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+import spark.implicits._
+import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
+
+val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_lur601_pipeline", lang = "en")
+val df = Seq("I love spark-nlp").toDF("text")
+val annotations = pipeline.transform(df)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_lur601_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|653.0 MB| + +## References + +https://huggingface.co/lur601/xlm-roberta-base-finetuned-panx-de + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_maarten1953_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_maarten1953_en.md new file mode 100644 index 00000000000000..d6eebb92c95b25 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_maarten1953_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_maarten1953 XlmRoBertaForTokenClassification from Maarten1953 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_maarten1953 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_maarten1953` is a English model originally trained by Maarten1953. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_maarten1953_en_5.5.1_3.0_1734323780670.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_maarten1953_en_5.5.1_3.0_1734323780670.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, XlmRoBertaForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_maarten1953","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+import spark.implicits._
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_maarten1953", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_maarten1953| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|840.8 MB| + +## References + +https://huggingface.co/Maarten1953/xlm-roberta-base-finetuned-panx-de \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_maarten1953_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_maarten1953_pipeline_en.md new file mode 100644 index 00000000000000..d4b85e2d7f6b67 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_maarten1953_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_maarten1953_pipeline pipeline XlmRoBertaForTokenClassification from Maarten1953 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_maarten1953_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_maarten1953_pipeline` is a English model originally trained by Maarten1953. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_maarten1953_pipeline_en_5.5.1_3.0_1734323872205.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_maarten1953_pipeline_en_5.5.1_3.0_1734323872205.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.pretrained import PretrainedPipeline
+
+pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_maarten1953_pipeline", lang = "en")
+df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+import spark.implicits._
+import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
+
+val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_maarten1953_pipeline", lang = "en")
+val df = Seq("I love spark-nlp").toDF("text")
+val annotations = pipeline.transform(df)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_maarten1953_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|840.8 MB| + +## References + +https://huggingface.co/Maarten1953/xlm-roberta-base-finetuned-panx-de + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_mcguiver_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_mcguiver_en.md new file mode 100644 index 00000000000000..318f9ea17a6b8a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_mcguiver_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_mcguiver XlmRoBertaForTokenClassification from mcguiver +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_mcguiver +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_mcguiver` is a English model originally trained by mcguiver. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_mcguiver_en_5.5.1_3.0_1734321448464.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_mcguiver_en_5.5.1_3.0_1734321448464.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, XlmRoBertaForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_mcguiver","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+import spark.implicits._
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_mcguiver", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_mcguiver| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|840.8 MB| + +## References + +https://huggingface.co/mcguiver/xlm-roberta-base-finetuned-panx-de \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_mcguiver_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_mcguiver_pipeline_en.md new file mode 100644 index 00000000000000..1b13f26d72e567 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_mcguiver_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_mcguiver_pipeline pipeline XlmRoBertaForTokenClassification from mcguiver +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_mcguiver_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_mcguiver_pipeline` is a English model originally trained by mcguiver. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_mcguiver_pipeline_en_5.5.1_3.0_1734321536184.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_mcguiver_pipeline_en_5.5.1_3.0_1734321536184.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.pretrained import PretrainedPipeline
+
+pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_mcguiver_pipeline", lang = "en")
+df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+import spark.implicits._
+import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
+
+val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_mcguiver_pipeline", lang = "en")
+val df = Seq("I love spark-nlp").toDF("text")
+val annotations = pipeline.transform(df)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_mcguiver_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|840.8 MB| + +## References + +https://huggingface.co/mcguiver/xlm-roberta-base-finetuned-panx-de + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_mealduct_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_mealduct_en.md new file mode 100644 index 00000000000000..d1dee39d215cd8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_mealduct_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_mealduct XlmRoBertaForTokenClassification from MealDuct +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_mealduct +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_mealduct` is a English model originally trained by MealDuct. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_mealduct_en_5.5.1_3.0_1734323136031.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_mealduct_en_5.5.1_3.0_1734323136031.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, XlmRoBertaForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_mealduct","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+import spark.implicits._
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_mealduct", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_mealduct| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|840.8 MB| + +## References + +https://huggingface.co/MealDuct/xlm-roberta-base-finetuned-panx-de \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_mealduct_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_mealduct_pipeline_en.md new file mode 100644 index 00000000000000..78630cd03deac0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_mealduct_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_mealduct_pipeline pipeline XlmRoBertaForTokenClassification from MealDuct +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_mealduct_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_mealduct_pipeline` is a English model originally trained by MealDuct. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_mealduct_pipeline_en_5.5.1_3.0_1734323220882.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_mealduct_pipeline_en_5.5.1_3.0_1734323220882.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.pretrained import PretrainedPipeline
+
+pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_mealduct_pipeline", lang = "en")
+df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+import spark.implicits._
+import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
+
+val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_mealduct_pipeline", lang = "en")
+val df = Seq("I love spark-nlp").toDF("text")
+val annotations = pipeline.transform(df)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_mealduct_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|840.8 MB| + +## References + +https://huggingface.co/MealDuct/xlm-roberta-base-finetuned-panx-de + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_nhduc1993_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_nhduc1993_en.md new file mode 100644 index 00000000000000..66ee42e678dcc1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_nhduc1993_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_nhduc1993 XlmRoBertaForTokenClassification from nhduc1993 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_nhduc1993 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_nhduc1993` is a English model originally trained by nhduc1993. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_nhduc1993_en_5.5.1_3.0_1734321443463.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_nhduc1993_en_5.5.1_3.0_1734321443463.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, XlmRoBertaForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_nhduc1993","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+import spark.implicits._
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_nhduc1993", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_nhduc1993| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|840.8 MB| + +## References + +https://huggingface.co/nhduc1993/xlm-roberta-base-finetuned-panx-de \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_nhduc1993_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_nhduc1993_pipeline_en.md new file mode 100644 index 00000000000000..1367873e2ccb08 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_nhduc1993_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_nhduc1993_pipeline pipeline XlmRoBertaForTokenClassification from nhduc1993 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_nhduc1993_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_nhduc1993_pipeline` is a English model originally trained by nhduc1993. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_nhduc1993_pipeline_en_5.5.1_3.0_1734321531370.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_nhduc1993_pipeline_en_5.5.1_3.0_1734321531370.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.pretrained import PretrainedPipeline
+
+pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_nhduc1993_pipeline", lang = "en")
+df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+import spark.implicits._
+import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
+
+val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_nhduc1993_pipeline", lang = "en")
+val df = Seq("I love spark-nlp").toDF("text")
+val annotations = pipeline.transform(df)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_nhduc1993_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|840.8 MB| + +## References + +https://huggingface.co/nhduc1993/xlm-roberta-base-finetuned-panx-de + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_rosepasta_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_rosepasta_en.md new file mode 100644 index 00000000000000..1bc24a11bc892a --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_rosepasta_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_rosepasta XlmRoBertaForTokenClassification from RosePasta +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_rosepasta +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_rosepasta` is a English model originally trained by RosePasta. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_rosepasta_en_5.5.1_3.0_1734322443484.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_rosepasta_en_5.5.1_3.0_1734322443484.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, XlmRoBertaForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_rosepasta","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+import spark.implicits._
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_rosepasta", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_rosepasta| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|841.2 MB| + +## References + +https://huggingface.co/RosePasta/xlm-roberta-base-finetuned-panx-de \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_rosepasta_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_rosepasta_pipeline_en.md new file mode 100644 index 00000000000000..68652abe6ceda1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_rosepasta_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_rosepasta_pipeline pipeline XlmRoBertaForTokenClassification from RosePasta +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_rosepasta_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_rosepasta_pipeline` is a English model originally trained by RosePasta. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_rosepasta_pipeline_en_5.5.1_3.0_1734322528431.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_rosepasta_pipeline_en_5.5.1_3.0_1734322528431.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.pretrained import PretrainedPipeline
+
+pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_rosepasta_pipeline", lang = "en")
+df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+import spark.implicits._
+import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
+
+val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_rosepasta_pipeline", lang = "en")
+val df = Seq("I love spark-nlp").toDF("text")
+val annotations = pipeline.transform(df)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_rosepasta_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|841.2 MB| + +## References + +https://huggingface.co/RosePasta/xlm-roberta-base-finetuned-panx-de + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_shiou0601_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_shiou0601_en.md new file mode 100644 index 00000000000000..f6d8ddcf699bc3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_shiou0601_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_shiou0601 XlmRoBertaForTokenClassification from Shiou0601 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_shiou0601 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_shiou0601` is a English model originally trained by Shiou0601. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_shiou0601_en_5.5.1_3.0_1734321247499.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_shiou0601_en_5.5.1_3.0_1734321247499.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, XlmRoBertaForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_shiou0601","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+import spark.implicits._
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_shiou0601", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_shiou0601| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|840.8 MB| + +## References + +https://huggingface.co/Shiou0601/xlm-roberta-base-finetuned-panx-de \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_shiou0601_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_shiou0601_pipeline_en.md new file mode 100644 index 00000000000000..c53930719a836f --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_shiou0601_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_shiou0601_pipeline pipeline XlmRoBertaForTokenClassification from Shiou0601 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_shiou0601_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_shiou0601_pipeline` is a English model originally trained by Shiou0601. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_shiou0601_pipeline_en_5.5.1_3.0_1734321332218.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_shiou0601_pipeline_en_5.5.1_3.0_1734321332218.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.pretrained import PretrainedPipeline
+
+pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_shiou0601_pipeline", lang = "en")
+df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+import spark.implicits._
+import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
+
+val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_shiou0601_pipeline", lang = "en")
+val df = Seq("I love spark-nlp").toDF("text")
+val annotations = pipeline.transform(df)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_shiou0601_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|840.8 MB| + +## References + +https://huggingface.co/Shiou0601/xlm-roberta-base-finetuned-panx-de + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_ultimecia_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_ultimecia_en.md new file mode 100644 index 00000000000000..19d64cdf263125 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_ultimecia_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_ultimecia XlmRoBertaForTokenClassification from ultimecia +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_ultimecia +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_ultimecia` is a English model originally trained by ultimecia. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_ultimecia_en_5.5.1_3.0_1734321835659.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_ultimecia_en_5.5.1_3.0_1734321835659.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, XlmRoBertaForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_ultimecia","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+import spark.implicits._
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_ultimecia", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
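+
+Fitting is cheap here because every stage is pretrained, but persisting the fitted pipeline still avoids re-downloading the model on each run. A hedged sketch; the save path is illustrative only:
+
+```python
+from pyspark.ml import PipelineModel
+
+# Hedged sketch: persist and reload the fitted pipeline (path is illustrative).
+pipelineModel.write().overwrite().save("/tmp/xlm_roberta_panx_ner_pipeline")
+reloaded = PipelineModel.load("/tmp/xlm_roberta_panx_ner_pipeline")
+reloaded.transform(data).select("ner.result").show(truncate=False)
+```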
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_ultimecia| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|835.3 MB| + +## References + +https://huggingface.co/ultimecia/xlm-roberta-base-finetuned-panx-de \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_wendao_123_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_wendao_123_en.md new file mode 100644 index 00000000000000..5b33a6fe69fc6b --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_wendao_123_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_wendao_123 XlmRoBertaForTokenClassification from Wendao-123 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_wendao_123 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_wendao_123` is a English model originally trained by Wendao-123. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_wendao_123_en_5.5.1_3.0_1734324275309.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_wendao_123_en_5.5.1_3.0_1734324275309.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, XlmRoBertaForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_wendao_123","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+import spark.implicits._
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_german_wendao_123", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
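+
+The classifier emits one IOB tag per token; to work with whole entities instead, a `NerConverter` stage can merge those tags into chunks. A minimal sketch, assuming the `document`, `token`, and `ner` column names used in the snippet above:
+
+```python
+from sparknlp.annotator import NerConverter
+
+# Hedged sketch: merge token-level IOB tags into entity chunks.
+converter = NerConverter() \
+    .setInputCols(["document", "token", "ner"]) \
+    .setOutputCol("ner_chunk")
+
+converter.transform(pipelineDF) \
+    .selectExpr("explode(ner_chunk.result) as entity") \
+    .show(truncate=False)
+```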
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_wendao_123| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|840.8 MB| + +## References + +https://huggingface.co/Wendao-123/xlm-roberta-base-finetuned-panx-de \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_wendao_123_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_wendao_123_pipeline_en.md new file mode 100644 index 00000000000000..9b10a3f91fde04 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_german_wendao_123_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_german_wendao_123_pipeline pipeline XlmRoBertaForTokenClassification from Wendao-123 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_german_wendao_123_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_german_wendao_123_pipeline` is a English model originally trained by Wendao-123. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_wendao_123_pipeline_en_5.5.1_3.0_1734324367944.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_german_wendao_123_pipeline_en_5.5.1_3.0_1734324367944.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.pretrained import PretrainedPipeline
+
+pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_wendao_123_pipeline", lang = "en")
+df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+annotations = pipeline.transform(df)
+
+```
+```scala
+
+import spark.implicits._
+import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
+
+val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_german_wendao_123_pipeline", lang = "en")
+val df = Seq("I love spark-nlp").toDF("text")
+val annotations = pipeline.transform(df)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_german_wendao_123_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|840.8 MB| + +## References + +https://huggingface.co/Wendao-123/xlm-roberta-base-finetuned-panx-de + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_italian_leedongjae_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_italian_leedongjae_en.md new file mode 100644 index 00000000000000..cc7522cc50cc7e --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_italian_leedongjae_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_italian_leedongjae XlmRoBertaForTokenClassification from leedongjae +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_italian_leedongjae +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_italian_leedongjae` is a English model originally trained by leedongjae. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_italian_leedongjae_en_5.5.1_3.0_1734322829379.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_italian_leedongjae_en_5.5.1_3.0_1734322829379.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, XlmRoBertaForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol('text') \
+    .setOutputCol('document')
+
+tokenizer = Tokenizer() \
+    .setInputCols(['document']) \
+    .setOutputCol('token')
+
+tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_italian_leedongjae","en") \
+    .setInputCols(["document","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+import spark.implicits._
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_italian_leedongjae", "en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val data = Seq("I love spark-nlp").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_italian_leedongjae| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|816.7 MB| + +## References + +https://huggingface.co/leedongjae/xlm-roberta-base-finetuned-panx-it \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_italian_leedongjae_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_italian_leedongjae_pipeline_en.md new file mode 100644 index 00000000000000..6d6eaf1e929e41 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_italian_leedongjae_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_italian_leedongjae_pipeline pipeline XlmRoBertaForTokenClassification from leedongjae +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_italian_leedongjae_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_italian_leedongjae_pipeline` is a English model originally trained by leedongjae. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_italian_leedongjae_pipeline_en_5.5.1_3.0_1734322928500.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_italian_leedongjae_pipeline_en_5.5.1_3.0_1734322928500.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_italian_leedongjae_pipeline", lang = "en")

# df is any DataFrame with a "text" column to annotate
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
import spark.implicits._

val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_italian_leedongjae_pipeline", lang = "en")

// df is any DataFrame with a "text" column to annotate
val df = Seq("I love spark-nlp").toDF("text")
val annotations = pipeline.transform(df)

```
</div
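
For quick experiments it can be easier to skip DataFrames entirely. The following is a hedged sketch using `PretrainedPipeline.annotate()` on a raw string; the `ner` output key is an assumption based on the token classifier included in this pipeline.

```python
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_italian_leedongjae_pipeline", lang="en")

# annotate() runs the pipeline on a plain string and returns a dict of result
# lists; the keys mirror the pipeline's output columns.
result = pipeline.annotate("John lives in Berlin and works for ACME.")
print(result.get("ner"))
```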
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_italian_leedongjae_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|816.8 MB| + +## References + +https://huggingface.co/leedongjae/xlm-roberta-base-finetuned-panx-it + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_italian_wndlek3_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_italian_wndlek3_en.md new file mode 100644 index 00000000000000..47565c2c64032d --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_italian_wndlek3_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_italian_wndlek3 XlmRoBertaForTokenClassification from wndlek3 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_italian_wndlek3 +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_italian_wndlek3` is a English model originally trained by wndlek3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_italian_wndlek3_en_5.5.1_3.0_1734323029013.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_italian_wndlek3_en_5.5.1_3.0_1734323029013.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, XlmRoBertaForTokenClassification
from pyspark.ml import Pipeline

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_italian_wndlek3","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_panx_italian_wndlek3", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div
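
If you only care about tokens that were actually tagged as entities, the annotations can be exploded and filtered with standard Spark SQL functions. This is a minimal sketch, assuming the `pipelineDF` produced above and the usual IOB tag scheme in which `O` marks non-entity tokens.

```python
from pyspark.sql import functions as F

# Explode the token-level annotations and drop everything tagged "O".
tags = pipelineDF.select(F.explode(F.col("ner")).alias("tag"))
tags.filter(F.col("tag.result") != "O") \
    .select("tag.result", "tag.begin", "tag.end") \
    .show(truncate=False)
```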
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_italian_wndlek3| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|816.7 MB| + +## References + +https://huggingface.co/wndlek3/xlm-roberta-base-finetuned-panx-it \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_italian_wndlek3_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_italian_wndlek3_pipeline_en.md new file mode 100644 index 00000000000000..f0bbdb5adcc2bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_panx_italian_wndlek3_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_panx_italian_wndlek3_pipeline pipeline XlmRoBertaForTokenClassification from wndlek3 +author: John Snow Labs +name: xlm_roberta_base_finetuned_panx_italian_wndlek3_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_panx_italian_wndlek3_pipeline` is a English model originally trained by wndlek3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_italian_wndlek3_pipeline_en_5.5.1_3.0_1734323126143.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_panx_italian_wndlek3_pipeline_en_5.5.1_3.0_1734323126143.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_panx_italian_wndlek3_pipeline", lang = "en")

# df is any DataFrame with a "text" column to annotate
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
import spark.implicits._

val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_panx_italian_wndlek3_pipeline", lang = "en")

// df is any DataFrame with a "text" column to annotate
val df = Seq("I love spark-nlp").toDF("text")
val annotations = pipeline.transform(df)

```
</div
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_panx_italian_wndlek3_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|816.8 MB| + +## References + +https://huggingface.co/wndlek3/xlm-roberta-base-finetuned-panx-it + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_wol_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_wol_en.md new file mode 100644 index 00000000000000..c85951f6d1aa92 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_wol_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_wol XlmRoBertaForTokenClassification from vonewman +author: John Snow Labs +name: xlm_roberta_base_finetuned_wol +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, xlm_roberta, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: XlmRoBertaForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_wol` is a English model originally trained by vonewman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_wol_en_5.5.1_3.0_1734323053302.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_wol_en_5.5.1_3.0_1734323053302.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, XlmRoBertaForTokenClassification
from pyspark.ml import Pipeline

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_wol","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = XlmRoBertaForTokenClassification.pretrained("xlm_roberta_base_finetuned_wol", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_wol| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|817.6 MB| + +## References + +https://huggingface.co/vonewman/xlm-roberta-base-finetuned-wol \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_wol_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_wol_pipeline_en.md new file mode 100644 index 00000000000000..d543f5b0f93f20 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-xlm_roberta_base_finetuned_wol_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English xlm_roberta_base_finetuned_wol_pipeline pipeline XlmRoBertaForTokenClassification from vonewman +author: John Snow Labs +name: xlm_roberta_base_finetuned_wol_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained XlmRoBertaForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`xlm_roberta_base_finetuned_wol_pipeline` is a English model originally trained by vonewman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_wol_pipeline_en_5.5.1_3.0_1734323153881.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_base_finetuned_wol_pipeline_en_5.5.1_3.0_1734323153881.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("xlm_roberta_base_finetuned_wol_pipeline", lang = "en")

# df is any DataFrame with a "text" column to annotate
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
import spark.implicits._

val pipeline = new PretrainedPipeline("xlm_roberta_base_finetuned_wol_pipeline", lang = "en")

// df is any DataFrame with a "text" column to annotate
val df = Seq("I love spark-nlp").toDF("text")
val annotations = pipeline.transform(df)

```
</div
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|xlm_roberta_base_finetuned_wol_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|817.6 MB| + +## References + +https://huggingface.co/vonewman/xlm-roberta-base-finetuned-wol + +## Included Models + +- DocumentAssembler +- TokenizerModel +- XlmRoBertaForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-yert_ner_en.md b/docs/_posts/ahmedlone127/2024-12-16-yert_ner_en.md new file mode 100644 index 00000000000000..0e92e5d4204cae --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-yert_ner_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English yert_ner BertForTokenClassification from Matthew-Lund +author: John Snow Labs +name: yert_ner +date: 2024-12-16 +tags: [en, open_source, onnx, token_classification, bert, ner] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`yert_ner` is a English model originally trained by Matthew-Lund. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/yert_ner_en_5.5.1_3.0_1734336685358.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/yert_ner_en_5.5.1_3.0_1734336685358.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, BertForTokenClassification
from pyspark.ml import Pipeline

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

tokenClassifier = BertForTokenClassification.pretrained("yert_ner","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("ner")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)

```
```scala

import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

val tokenClassifier = BertForTokenClassification.pretrained("yert_ner", "en")
    .setInputCols(Array("document","token"))
    .setOutputCol("ner")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)

```
</div
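
Token-level IOB tags are often easier to consume as whole entity chunks. The sketch below appends Spark NLP's `NerConverter` to the pipeline defined above; this stage is an illustrative addition and is not part of the original model card.

```python
from sparknlp.annotator import NerConverter

# Groups consecutive B-/I- tags into entity chunks with their text spans.
nerConverter = NerConverter() \
    .setInputCols(["document", "token", "ner"]) \
    .setOutputCol("entities")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier, nerConverter])
pipeline.fit(data).transform(data) \
    .selectExpr("entities.result as entities") \
    .show(truncate=False)
```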
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|yert_ner| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/Matthew-Lund/YERT-NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-yert_ner_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-yert_ner_pipeline_en.md new file mode 100644 index 00000000000000..bd76c49fb058f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-yert_ner_pipeline_en.md @@ -0,0 +1,70 @@ +--- +layout: model +title: English yert_ner_pipeline pipeline BertForTokenClassification from Matthew-Lund +author: John Snow Labs +name: yert_ner_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`yert_ner_pipeline` is a English model originally trained by Matthew-Lund. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/yert_ner_pipeline_en_5.5.1_3.0_1734336706596.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/yert_ner_pipeline_en_5.5.1_3.0_1734336706596.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("yert_ner_pipeline", lang = "en")

# df is any DataFrame with a "text" column to annotate
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
annotations = pipeline.transform(df)

```
```scala

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
import spark.implicits._

val pipeline = new PretrainedPipeline("yert_ner_pipeline", lang = "en")

// df is any DataFrame with a "text" column to annotate
val df = Seq("I love spark-nlp").toDF("text")
val annotations = pipeline.transform(df)

```
</div
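
To confirm what the downloaded pipeline actually contains (compare with the Included Models list below), you can inspect the underlying Spark `PipelineModel`. A small sketch:

```python
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("yert_ner_pipeline", lang="en")

# `model` is the wrapped Spark PipelineModel; its stages mirror the Included Models.
for stage in pipeline.model.stages:
    print(type(stage).__name__)
```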
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|yert_ner_pipeline| +|Type:|pipeline| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Language:|en| +|Size:|407.3 MB| + +## References + +https://huggingface.co/Matthew-Lund/YERT-NER + +## Included Models + +- DocumentAssembler +- TokenizerModel +- BertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-your_model_name_en.md b/docs/_posts/ahmedlone127/2024-12-16-your_model_name_en.md new file mode 100644 index 00000000000000..ec0904c61950db --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-your_model_name_en.md @@ -0,0 +1,88 @@ +--- +layout: model +title: English your_model_name BertForQuestionAnswering from utkuozuak +author: John Snow Labs +name: your_model_name +date: 2024-12-16 +tags: [en, open_source, onnx, question_answering, bert] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +engine: onnx +annotator: T5Transformer +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`your_model_name` is a English model originally trained by utkuozuak. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/your_model_name_en_5.5.1_3.0_1734332345099.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/your_model_name_en_5.5.1_3.0_1734332345099.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
from sparknlp.base import MultiDocumentAssembler
from sparknlp.annotator import BertForQuestionAnswering
from pyspark.ml import Pipeline

documentAssembler = MultiDocumentAssembler() \
     .setInputCols(["question", "context"]) \
     .setOutputCols(["document_question", "document_context"])

spanClassifier = BertForQuestionAnswering.pretrained("your_model_name","en") \
     .setInputCols(["document_question","document_context"]) \
     .setOutputCol("answer")

pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)
```
```scala
import com.johnsnowlabs.nlp.MultiDocumentAssembler
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._

val documentAssembler = new MultiDocumentAssembler()
     .setInputCols(Array("question", "context"))
     .setOutputCols(Array("document_question", "document_context"))

val spanClassifier = BertForQuestionAnswering.pretrained("your_model_name", "en")
     .setInputCols(Array("document_question","document_context"))
     .setOutputCol("answer")

val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDF("question", "context")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)
```
</div
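
After running the pipeline above, the predicted answer span is stored in the `answer` column as annotations. A minimal, illustrative way to read it back:

```python
# Assumes `pipelineDF` from the example above.
pipelineDF.selectExpr("document_question.result as question",
                      "answer.result as answer") \
          .show(truncate=False)
```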
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|your_model_name| +|Compatibility:|Spark NLP 5.5.1+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document]| +|Output Labels:|[output]| +|Language:|en| +|Size:|325.9 MB| + +## References + +References + +https://huggingface.co/utkuozuak/your_model_name \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2024-12-16-your_model_name_pipeline_en.md b/docs/_posts/ahmedlone127/2024-12-16-your_model_name_pipeline_en.md new file mode 100644 index 00000000000000..63dc4c308a97de --- /dev/null +++ b/docs/_posts/ahmedlone127/2024-12-16-your_model_name_pipeline_en.md @@ -0,0 +1,71 @@ +--- +layout: model +title: English your_model_name_pipeline pipeline BertForQuestionAnswering from utkuozuak +author: John Snow Labs +name: your_model_name_pipeline +date: 2024-12-16 +tags: [en, open_source, pipeline, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.5.1 +spark_version: 3.0 +supported: true +annotator: PipelineModel +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`your_model_name_pipeline` is a English model originally trained by utkuozuak. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/your_model_name_pipeline_en_5.5.1_3.0_1734332366292.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/your_model_name_pipeline_en_5.5.1_3.0_1734332366292.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("your_model_name_pipeline", lang = "en")
# df is a DataFrame carrying the question and context text columns expected by the pipeline
annotations = pipeline.transform(df)
```
```scala
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("your_model_name_pipeline", lang = "en")
// df is a DataFrame carrying the question and context text columns expected by the pipeline
val annotations = pipeline.transform(df)
```
</div
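
Question-answering pipelines expect the input DataFrame to carry both a question and a context string. The column names below are an assumption; they must match the input columns of the pipeline's assembler stage, so treat this as a sketch rather than the card's canonical usage.

```python
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("your_model_name_pipeline", lang="en")

# Hypothetical input column names; adjust to whatever the downloaded pipeline expects.
df = spark.createDataFrame(
    [["What framework do I use?", "I use spark-nlp."]]
).toDF("question", "context")

# The "answer" output column is likewise an assumption based on the model card above.
pipeline.transform(df).selectExpr("answer.result as answer").show(truncate=False)
```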
+

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|your_model_name_pipeline|
|Type:|pipeline|
|Compatibility:|Spark NLP 5.5.1+|
|License:|Open Source|
|Edition:|Official|
|Language:|en|
|Size:|325.9 MB|

## References

https://huggingface.co/utkuozuak/your_model_name

## Included Models

- MultiDocumentAssembler
- BertForQuestionAnswering \ No newline at end of file