Are there any better examples for running the rest-api WITHOUT docker? The explanation given on the rest-api page is really bad. #5693

msarro · 2023-08-30T18:04:59Z

msarro
Aug 30, 2023

Context:
I am building an ingest/QA api with haystack using DPR, which works. I am attempting to add in promptnode/prompttemplate. However, when running docker, everything seems to load, until the very end when I get an error that essentially says, "something bad happened, check the stack trace", except no stack trace is generated, or if it is, it isn't included in the docker output. So I am trying to re-implement as pure python to run locally without using docker/yml.

The problem is that the documentation's total explanation for how to do this on the rest api page is:

If you prefer to not use Docker with your Haystack API, you can start the REST API server and supporting Haystack pipeline by running the gunicorn server manually with the following command:

gunicorn rest_api.application:app -b 0.0.0.0:8000 -k uvicorn.workers.UvicornWorker -t 300

The problem is, that doesn't work. I get a ton of errors. There isn't any real example of what libraries need to be imported, how to structure it, what vars/functions need to be defined, etc.

The closest I can find is code here:
https://github.com/deepset-ai/haystack/blob/main/rest_api/rest_api/application.py

which provides a few breadcrumbs, but still doesn't really work.

There is no sample code showing how the source should be structured to be compatible with instantiation this way. Google seems to point me towards a bunch of examples using flask or fastapi.

Considering how awesome most of haystack's code is, this is surprisingly devoid of detail.

Are there any better example implementations for how you would structure a python file to be instantiated this way for running as a rest api without docker?

bilgeyucel · 2023-08-31T16:46:36Z

bilgeyucel
Aug 31, 2023
Maintainer

Hi @msarro, I'm sorry to hear that. We are aware that REST API implementation is not the best. For that reason, we have a roadmap item: Shape Requirements for REST API where we collect requirements to improve the implementation.

In the meantime, to have a better guidance around the current implementation, I updated the documentation according to your feedback. Feel free to check and let me know if something is missing: REST API. If you need more info about Haystack API, you can check also this tutorial: Using Haystack with REST API. This tutorial uses Docker but explains pipeline yaml structure in details.

Regarding to your question, you don't need to change anything in a python file. What we miss in the documentation is that you needed to install rest_api module to use the gunicorn server. After installation, you can start the gunicorn server with the command above and you should be able to run this YAML file:

[Toggle] Pipeline YAML Example with PromptNode & PromptTemplate

version: '1.19.0'
 
components:
  - name: DocumentStore
    type: ElasticsearchDocumentStore
  - name: EmbeddingRetriever # Selects the most relevant documents from the document store
    type: EmbeddingRetriever # Uses a Transformer model to encode the document and the query
    params:
      document_store: DocumentStore
      embedding_model: sentence-transformers/multi-qa-mpnet-base-dot-v1 # Model optimized for semantic search. It has been trained on 215M (question, answer) pairs from diverse sources.
      model_format: sentence_transformers
      top_k: 2 # The number of results to return
  - name: qa_template
    type: PromptTemplate
    params:
      output_parser:
        type: AnswerParser
      prompt: "Given the context please answer the question. Context: {join(documents)}; \
              Question: {query}; \
              Answer:
              "
  - name: PromptNode
    type: PromptNode
    params:
      default_prompt_template: qa_template
      max_length: 50 # The maximum number of tokens the generated answer can have
      model_kwargs: # Specifies additional model settings
        temperature: 0 # Lower temperature works best for fact-based qa
      model_name_or_path: google/flan-t5-base
  - name: FileTypeClassifier # Routes files based on their extension to appropriate converters, by default txt, pdf, md, docx, html
    type: FileTypeClassifier
  - name: TextConverter # Converts files into documents
    type: TextConverter
  - name: PDFConverter # Converts PDFs into documents
    type: PDFToTextConverter
  - name: Preprocessor # Splits documents into smaller ones and cleans them up
    type: PreProcessor
    params:
      # With a vector-based retriever, it's good to split your documents into smaller ones
      split_by: word # The unit by which you want to split the documents
      split_length: 250 # The max number of words in a document
      split_overlap: 20 # Enables the sliding window approach
      language: en
      split_respect_sentence_boundary: True # Retains complete sentences in split documents

# Here you define how the nodes are organized in the pipelines
# For each node, specify its input
pipelines:
  - name: query
    nodes:
      - name: EmbeddingRetriever
        inputs: [Query]
      - name: PromptNode
        inputs: [EmbeddingRetriever]
  - name: indexing
    nodes:
    # Depending on the file type, we use a Text or PDF converter
      - name: FileTypeClassifier
        inputs: [File]
      - name: TextConverter
        inputs: [FileTypeClassifier.output_1] # Ensures that this converter receives txt files
      - name: PDFConverter
        inputs: [FileTypeClassifier.output_2] # Ensures that this converter receives PDFs
      - name: Preprocessor
        inputs: [TextConverter, PDFConverter]
      - name: EmbeddingRetriever
        inputs: [Preprocessor]
      - name: DocumentStore
        inputs: [EmbeddingRetriever]

Hope this answer helps 🙌 If you'd like to share more information about your docker setting, your pipeline yaml file and the full error you get, I'm happy to help you solving the error with Docker

3 replies

msarro Sep 1, 2023
Author

Hello, I will give this a shot, and thank you for the super fast response!

I think my goal had been to sidestep using the pipeline yaml entirely and using just python code, but this still gives me the ability to test and see if the docker layer was causing the issue. I'll give it a shot and report back!

bilgeyucel Sep 1, 2023
Maintainer

The rest_api module is not designed to be used with python code. I haven't tried this workaround but the module should work after replacing query_pipeline and index_pipeline values in __init__.py with your pipeline.

Alternatively, although this is not exactly what you're looking for, there's a save_to_yaml() function to generate a yaml file from a pipeline. It might be useful for you and I think this approach is cleaner than the first one

msarro Sep 11, 2023
Author

Based on your suggestions I revisited the pipeline/yml approach and did have some success using promptnode with a local transformer. However I still seem to have issues where promptnode fails to initialize when using the container based approach when trying to use azure openAI.

This is the docker log:

haystack-api     | 09/11/2023 03:05:03 PM Using devices: CUDA:0 - Number of GPUs: 1
haystack-api     | 09/11/2023 03:05:03 PM Init retriever using embeddings of model sentence-transformers/multi-qa-mpnet-base-dot-v1
haystack-api     | 09/11/2023 03:05:04 PM Error loading query pipeline from /opt/pipelines/test-pipelines.haystack-pipeline.yml.
haystack-api     |  Failed loading pipeline component 'PromptNode'. See the stacktrace above for more information.
haystack-api     |
haystack-api     |
haystack-api     | pdftotext version 4.04 [www.xpdfreader.com]
haystack-api     | Copyright 1996-2022 Glyph & Cog, LLC
haystack-api     | 09/11/2023 03:05:04 PM Using devices: CUDA:0 - Number of GPUs: 1
haystack-api     | 09/11/2023 03:05:04 PM Init retriever using embeddings of model sentence-transformers/multi-qa-mpnet-base-dot-v1
haystack-api     | INFO:     Started server process [1]
haystack-api     | INFO:     Waiting for application startup.
haystack-api     | INFO:     Application startup complete.
haystack-api     | INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)

So, some error is generated, but it isn't included in the output of the docker log, and no stack trace is output to diagnose.

My specific promptnode looks like this atm:

version: ignore

components:    # define all the building-blocks for Pipeline
  - name: OpenSearchDocumentStore
    type: OpenSearchDocumentStore
    params:
      host: opensearch-node
      port: 9200
      scheme: http
      similarity: "dot_product"
      embedding_dim: 768
  - name: Retriever # Selects the most relevant documents from the document store
    type: EmbeddingRetriever # Uses a Transformer model to encode the document and the query
    params:
      document_store: OpenSearchDocumentStore
      embedding_model: sentence-transformers/multi-qa-mpnet-base-dot-v1
      model_format: sentence_transformers
      top_k: 2 # The number of results to return
      max_seq_len: 1536
      batch_size: 16
      use_gpu: true
  - name: qa_template
    type: PromptTemplate
    params:
      output_parser:
        type: AnswerParser
      prompt: "Given the context please answer the question. Context: {join(documents)}; Question: {query}; Answer:"
  - name: PromptNode
    type: PromptNode
    params:
      default_prompt_template: qa_template
      model_name_or_path: gpt-3.5-turbo
      api_key: ourapikey
      max_length: 512
      model_kwargs:
        api_version: 2023-05-15
        azure_base_url: https://oursubdomain.openai.azure.com
        azure_deployment_name: ourdeploymentname
  - name: TextFileConverter
    type: TextConverter
    params:
      remove_numeric_tables: True
      valid_languages: ["en"]
  - name: PDFFileConverter
    type: PDFToTextConverter
    params:
      remove_numeric_tables: True
      valid_languages: ["en"]
  - name: MarkdownFileConverter
    type: MarkdownConverter
    params:
      remove_numeric_tables: True
      valid_languages: ["en"]
  - name: Preprocessor
    type: PreProcessor
    params:
      split_by: word
      split_length: 250
      split_overlap: 20
      language: en
  - name: FileTypeClassifier
    type: FileTypeClassifier

pipelines:
  - name: query
    nodes:
      - name: Retriever
        inputs: [Query]
      - name: PromptNode
        inputs: [Retriever]
  - name: indexing
    nodes:
      - name: FileTypeClassifier
        inputs: [File]
      - name: TextFileConverter
        inputs: ["FileTypeClassifier.output_1"]
      - name: PDFFileConverter
        inputs: ["FileTypeClassifier.output_2"]
      - name: MarkdownFileConverter
        inputs: ["FileTypeClassifier.output_3"]
      - name: Preprocessor
        inputs: [PDFFileConverter, TextFileConverter, MarkdownFileConverter]
      - name: Retriever
        inputs: [Preprocessor]
      - name: OpenSearchDocumentStore
        inputs: [Retriever]

I found this PR which suggests that model_kwargs may have been changed to invocation_kwargs but it seems to fail hard, so I assume the change was never fully implemented.

I've also tried creating a PromptModel and referencing that with identical behavior. I feel like i'm close and it's just something silly. I just need to get some kind of actionable output to troubleshoot and the container is hiding it. That's why I had originally wanted to sidestep the container entirely and see if there is some other library that is spitting out meaningful info that isn't being captured in the docker log.

  - name: PromptModel
    type: PromptModel
    params:
      model_name_or_path: gpt-3.5-turbo
      api_key: ourapikey
      max_length: 512
      model_kwargs:
        api_version: 2023-05-15
        azure_base_url: https://oursubdomain.openai.azure.com
        azure_deployment_name: ourdeploymentname
  - name: PromptNode
    type: PromptNode
    params:
      default_prompt_template: qa_template
      model_name_or_path: PromptModel

w1gs · 2023-09-09T21:17:03Z

w1gs
Sep 9, 2023

Hi @msarro,

I've also run into some issues with the API before. Here's a few things you can try that have helped me. First if you can get just the elasticsearch docker running (docker compose up elasticsearch) the following 2 should work.

When you run the original command to start the server, do it within the rest_api folder of the project.
The rest_api.application:app part of the code is telling gunicorn to look in rest_api/application.py for the app object. If this works, it's most likely the Docker layer giving you issues. Another thing you can try is just removing the workers so the command will just be gunicorn rest_api.application:app -b 0.0.0.0:8000
The second thing you can try is just running it with uvicorn directly from within the rest_api folder:
uvicorn rest_api.application:app --host 0.0.0.0 --port 8000 --reload

If you cannot get those working, here's a very basic and stripped down implementation to get you started with setting up your own API...
Note: This is a very basic implementation that should only be used as an example to understand the API more. This is just one implementation and should not be considered the only or "correct" way. The Haystack API has some good base features which this example lacks that you would want in a production system such as health checks.

Project Structure

- demo_api/
	- __init__.py # This is just a blank file. This lets fastapi know that this folder is a module.
	- application.py # Core stuff needed to get the API setup and running.
	- config.py # This file will usually hold configuration settings for the api (think model names, api keys, model kwargs, docstore index names, etc)
	- models.py (or schema.py) # Would usually contain the different data structures you would need for your project.
- internal/
	- __init__.py
	- askQuestion.py # Contains the functions using Haystack that will do the backend processing

I would also usually throw in a routers folder that defines all the routes but for this example I will just define the endpoints in the main application file. Technically the only two files required here are the first two. Nothing is stopping you from defining everything in one huge file but that would just be madness...

application.py

from fastapi import FastAPI
from .internal.askQuestion import ask_llm
from fastapi.middleware.cors import CORSMiddleware
from .models import Question

# Define the app
app = FastAPI()

# Prevent CORS errors
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Example health endpoint to determine if API is accessible
@app.get("/api/health")
async def health():
    return {"status": "ok"}

# The endpoint that user's questions will be submitted to via POST.
@app.post("/api/ask")
# Question here is the defined data struct thats expected. FastAPI will return an error if the request doesnt match.
async def ask(question: Question):
    # Pass the question to function that will be using Haystack to answer it.
    response = ask_llm(question)
    return {"answer": [response]}

config.py

from pydantic import BaseSettings
from torch.cuda import is_available
from haystack.nodes import EmbeddingRetriever
from haystack.document_stores import FAISSDocumentStore


# Define settings
class Settings(BaseSettings):
    SQL_URL: str = "sqlite:///api_document_store.db"
    EMBEDDING_MODEL: str = "sentence-transformers/multi-qa-mpnet-base-cos-v1"
    DIMENSIONS: int = 768
    SIMILARITY: str = "cosine"
    MODEL_FORMAT: str = "sentence_transformers"
    LLM_MODEL: str = "gpt-4"
    GPT_ARGS: dict = {"temperature": 0.2}


CONFIG = Settings()

# Determine if were running on a machine with a GPU
ISGPU = is_available()

# Define doc store using settings
DOCUMENT_STORE = FAISSDocumentStore(
    faiss_index_factory_str="Flat",
    sql_url=CONFIG.SQL_URL,
    similarity=CONFIG.SIMILARITY,
    embedding_dim=CONFIG.DIMENSIONS,
)

# Initialize retriever
RETRIEVER = EmbeddingRetriever(
    document_store=DOCUMENT_STORE,
    embedding_model=CONFIG.EMBEDDING_MODEL,
    model_format=CONFIG.MODEL_FORMAT,
    use_gpu=ISGPU,
)

models.py

from pydantic import BaseModel


# Defines the Question struct we will be expecting.
# prompt must be a str and it is not optional
# username should be a string, but if None is passed set it to "Default User"
class Question(BaseModel):
    prompt: str
    username: str | None = "Default User"

askQuestion.py

from haystack import Pipeline
from haystack.nodes import PromptNode, PromptTemplate
from ..config import CONFIG, RETRIEVER
from ..models import Question


def ask_llm(question: Question):
    LFQA_PROMPT = PromptTemplate(
        prompt="Ansewr the following question based on the following documents: {question.prompt}....."
    )

    node = PromptNode(
        model_name_or_path=CONFIG.LLM_MODEL,
        default_prompt_template=LFQA_PROMPT,
        api_key=CONFIG.OPENAI_API_KEY,
        max_length=275,
        model_kwargs=CONFIG.GPT_ARGS,
    )

    pipe = Pipeline()
    pipe.add_node(component=RETRIEVER, name="retriever", inputs=["Query"])
    pipe.add_node(component=node, name="prompt_node", inputs=["retriever"])
    response = pipe.run(query=question.prompt)
    return response["results"][0]

The server can then be started from the the parent directory that contains the demo_api folder in it with:
uvicorn api_demo.application:app --host 0.0.0.0 --port 8080 --reload

Hope this helps! If you run into any issues or need help I'd be happy to assist in anyway I can.

2 replies

msarro Sep 11, 2023
Author

This is extremely helpful, thank you! I had gotten so far as starting to build out with fastAPI on my own, but yeah there's a ton of QOL stuff that Haystack includes by default that I really didn't want to lose. I managed to make a bit of progress getting promptnode to work with a local transformer, but openai still gives cryptic errors referring to a nonexistant stack trace.

I may go this route if I can't get unstuck in the next day or two! I really appreciate your time. This code looks like what I needed!

w1gs Sep 11, 2023

No problem! Would be happy to answer any other questions that might come up. Also on the OpenAI stuff.... Does the API key belong to a company account? If it does you may need to add in the organization Id to the requests. I remember running into an issue with that and had no context as to what was going on.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Are there any better examples for running the rest-api WITHOUT docker? The explanation given on the rest-api page is really bad. #5693

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Replies: 2 comments 5 replies

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

Select a reply

Are there any better examples for running the rest-api WITHOUT docker? The explanation given on the rest-api page is really bad. #5693

msarro Aug 30, 2023

Replies: 2 comments · 5 replies

bilgeyucel Aug 31, 2023 Maintainer

msarro Sep 1, 2023 Author

bilgeyucel Sep 1, 2023 Maintainer

msarro Sep 11, 2023 Author

w1gs Sep 9, 2023

msarro Sep 11, 2023 Author

w1gs Sep 11, 2023

msarro
Aug 30, 2023

Replies: 2 comments 5 replies

bilgeyucel
Aug 31, 2023
Maintainer

msarro Sep 1, 2023
Author

bilgeyucel Sep 1, 2023
Maintainer

msarro Sep 11, 2023
Author

w1gs
Sep 9, 2023

msarro Sep 11, 2023
Author