CODESTAR - CODE Surfing Tool for Answer Retrieval.

CODESTAR is a codebase chatbot that lets you ask questions about your code, from how a specific function works to a high-level overview of the entire project. It uses the Mistral-7B large language model (LLM) and a chat interface built with Streamlit.

Key Features

  1. Automated GitHub Repository Processing:
    Users only need to enter the name of a public GitHub repository. The tool clones the repository, splits it into manageable chunks, and embeds them for retrieval.

  2. LangChain Integration and Chat Interface:
    LangChain is used to build a question-answering (QA) retriever over the embedded chunks. Through the chat interface, users can ask questions and hold interactive sessions with their codebase.

  3. Source and Metadata Retrieval (Bonus Feature):
    Along with each response, the chatbot returns the source code files the response was generated from. Users can open these files to verify the answer and explore the relevant code in more depth, which makes the responses easier to trust.

  4. Code Functionality Exploration:
    Users can inquire about the functioning of specific portions of the code, seek explanations, and obtain a comprehensive understanding of the project.
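The chunk, embed, and retrieve-with-sources flow described in the features above can be sketched in plain Python. This is only an illustration under stated assumptions, not CODESTAR's actual implementation: the bag-of-words `embed` function is a toy stand-in for a real embedding model, and the file names `app.py` and `loader.py`, the chunk sizes, and every function name here are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(files):
    """files maps path -> source text; returns (path, chunk, vector) tuples."""
    index = []
    for path, text in files.items():
        for chunk in chunk_text(text):
            index.append((path, chunk, embed(chunk)))
    return index


def retrieve(index, question, k=2):
    """Return the k most similar chunks together with their source paths."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[2]), reverse=True)
    return [(path, chunk) for path, chunk, _ in ranked[:k]]


# Hypothetical two-file "repository" used only for this example.
files = {
    "app.py": "import streamlit as st\nst.title('chat')\n",
    "loader.py": "def clone_repository(url):\n    return url\n",
}
index = build_index(files)
hits = retrieve(index, "which function handles clone of the repository", k=1)
sources = [path for path, _ in hits]
print(sources)
```

Keeping the file path next to every chunk is what makes it possible to show users which source files a response was based on.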

Usage

This project uses CUDA, so make sure the appropriate CUDA drivers are installed on your machine before you proceed.
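A quick way to check the environment before launching the app is a small script like the one below. It is a sketch rather than part of CODESTAR: the function name `cuda_report` is hypothetical, the `nvidia-smi` check is an extra not mentioned in this README, and the PyTorch check only runs if `torch` is already installed.

```python
import shutil


def cuda_report():
    """Collect basic facts about the local CUDA setup."""
    report = {
        # nvcc ships with the CUDA toolkit.
        "nvcc_on_path": shutil.which("nvcc") is not None,
        # nvidia-smi ships with the NVIDIA driver.
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        # Stays None when PyTorch is not installed.
        "torch_cuda_available": None,
    }
    try:
        import torch

        report["torch_cuda_available"] = bool(torch.cuda.is_available())
    except ImportError:
        pass
    return report


report = cuda_report()
for name, value in report.items():
    print(f"{name}: {value}")
```

If `torch_cuda_available` is `False` or `None`, installing the CUDA build of PyTorch as described in the steps below is the likely next step.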

To use this codebase chatbot, follow these steps:

  1. Clone the repository:

git clone https://github.com/adismort14/CODESTAR.git

  2. Install the required dependencies:

pip install -r requirements.txt

- If PyTorch with CUDA support is not installed on the system: pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

- Ensure that the CUDA toolkit and drivers are installed. The following command should run without errors: `nvcc --version`

  3. Run the Streamlit app:

streamlit run gui.py

Access the chat interface by opening your web browser and navigating to http://localhost:8501.

Enter the name of your GitHub repository in the provided input fields.

The codebase will be chunked and embedded, and the chat interface will be displayed.

Ask questions or provide instructions using natural language, and the chatbot will respond accordingly.
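The step where a repository name is entered and the repository is cloned could look roughly like the sketch below. It assumes the name is given in `owner/repo` form, which this README does not specify, and the helper names `build_clone_url` and `clone_repository` are hypothetical rather than taken from the CODESTAR source.

```python
import re
import subprocess

# Assumed input format: "owner/repo", optionally ending in ".git".
_REPO_PATTERN = re.compile(r"^[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$")


def build_clone_url(repo_name):
    """Turn 'owner/repo' into an HTTPS clone URL for a public repository."""
    name = repo_name.strip()
    if name.endswith(".git"):
        name = name[:-4]
    if not _REPO_PATTERN.match(name):
        raise ValueError(f"expected 'owner/repo', got {repo_name!r}")
    return f"https://github.com/{name}.git"


def clone_repository(repo_name, dest):
    """Shallow-clone the repository into dest; requires git on PATH."""
    url = build_clone_url(repo_name)
    subprocess.run(["git", "clone", "--depth", "1", url, dest], check=True)


print(build_clone_url("adismort14/CODESTAR"))
```

Validating the name before calling `git` gives a clear error for malformed input instead of a failed clone. A shallow clone is used here because only the current file contents are needed for chunking.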

Demo Video

Short Demo of the App

Screenshots

Demo Pic 1

Demo Pic 2

Limitations and Considerations

  • The chatbot's capabilities are bounded by those of the Mistral-7B model.
  • A more capable LLM can be used instead, at the cost of higher resource usage.
  • Large codebases or repositories with complex structures may take longer to chunk and embed.
  • The accuracy and quality of responses depend on the language model and on how well the code embeddings represent the codebase.