Added example for Legal Simplifier #141

Open
wants to merge 5 commits into
base: main
88 changes: 88 additions & 0 deletions examples/legalsimplifier/README.md
@@ -0,0 +1,88 @@
# Legal Simplifier

Legal Simplifier is an application that simplifies legal documents, from the terms and conditions of an insurance policy to a business contract. This example also shows how to summarize the contents of a large document in chunks.

## Design

The script consists of three tools: a top-level tool that orchestrates everything, a summarizer that
will summarize one chunk of text at a time, and a Python script that ingests the PDF and splits it into
chunks and provides a specific chunk based on an index.

The summarizer tool looks at the entire summary up to the current chunk and then summarizes the current
chunk and adds it onto the end. In the case of models with very small context windows, or extremely large
documents, this approach may still exceed the context window, in which case another tool could be added to
only give the summarizer the previous few chunk summaries instead of all of them.
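
The windowed variant mentioned above could look roughly like the following. This is a minimal sketch, not part of the example: `rolling_summary`, `fake_summarize`, and the `window` parameter are hypothetical names, and the stand-in summarizer replaces the model call.

```python
from typing import Callable, List


def rolling_summary(chunks: List[str],
                    summarize: Callable[[str, str], str],
                    window: int = 3) -> List[str]:
    """Summarize chunks in order, passing only the last `window` summaries as context."""
    summaries: List[str] = []
    for chunk in chunks:
        # Limit the context to the most recent summaries to bound the prompt size.
        context = "\n\n".join(summaries[-window:]) if window > 0 else ""
        summaries.append(summarize(context, chunk))
    return summaries


# Stand-in for the model call (assumption): reports how many prior summaries it saw.
def fake_summarize(context: str, chunk: str) -> str:
    seen = len(context.split("\n\n")) if context else 0
    return f"{chunk} (saw {seen} prior)"


result = rolling_summary(["a", "b", "c", "d", "e"], fake_summarize, window=2)
print(result)
```

With `window=2`, the summarizer never receives more than two earlier summaries, no matter how many chunks the document has.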

Document size varies with what you upload, so larger documents may need to be split into chunks of 10,000 tokens to fit within GPT-4's context window.
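
To illustrate the idea of fixed-size chunks with a small overlap, here is a simplified sketch. It is an assumption-laden stand-in: whitespace-separated words play the role of tokens, and `split_tokens` is a hypothetical helper. The example itself relies on `TokenTextSplitter` with a `tiktoken` tokenizer in `main.py`.

```python
from typing import List


def split_tokens(tokens: List[str], chunk_size: int, overlap: int = 0) -> List[List[str]]:
    """Split a token list into chunks of at most `chunk_size`, sharing `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the input.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Words stand in for tokens here (assumption); real token counts will differ.
words = "one two three four five six seven".split()
chunks = split_tokens(words, chunk_size=3, overlap=1)
print(chunks)
```

Each chunk repeats the last token of the previous one, which is the role `chunk_overlap` plays in the real splitter.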

## Installation

### Prerequisites

- Python 3.8 or later
- Flask
- Other Python dependencies listed in `requirements.txt`.

### Steps

1. Clone the repository:

```bash
git clone https://github.com/gptscript-ai/gptscript.git
```

2. Navigate to the `examples/legalsimplifier` directory and install the dependencies:

Python:

```bash
pip install -r requirements.txt
```

Node:

```bash
npm install
```

3. Set up `OPENAI_API_KEY` (e.g. `export OPENAI_API_KEY="yourapikey123456"`). You can get your [API key here](https://platform.openai.com/api-keys).

4. Run the Flask application using `flask run` or `python app.py`
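
Before starting the app, it can help to confirm that the key from step 3 is visible to Python. The following is a minimal sketch; `require_api_key` is a hypothetical helper and is not part of this example's code.

```python
import os


def require_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise if it is missing or blank."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the app")
    return key


# A plain dict stands in for the real environment in this demonstration.
print(require_api_key({"OPENAI_API_KEY": "yourapikey123456"}))
```

Calling `require_api_key()` with no arguments checks the real environment instead of the demonstration dict.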

## Usage

1. Open your web browser and navigate to `http://127.0.0.1:5000/`.
2. Use the web interface to upload an a legal document in .pdf format.
Suggested change (Member):
2. Use the web interface to upload an a legal document in .pdf format.
2. Use the web interface to upload a legal document in .pdf format.

3. The application will analyze the document and show a summary.
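
The same upload can be scripted without the browser. This is a sketch under assumptions: the `/upload` route and the `file` form field come from `app.py`, while `build_multipart` and `upload_pdf` are hypothetical helpers written only with the standard library.

```python
import json
import urllib.request
import uuid
from typing import Tuple


def build_multipart(field: str, filename: str, data: bytes,
                    content_type: str = "application/pdf") -> Tuple[bytes, str]:
    """Build a single-file multipart/form-data body and its Content-Type header."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload_pdf(path: str, url: str = "http://127.0.0.1:5000/upload") -> str:
    """POST a PDF to the running app and return the summary from the JSON response."""
    with open(path, "rb") as f:
        body, ctype = build_multipart("file", "legal.pdf", f.read())
    req = urllib.request.Request(url, data=body,
                                 headers={"Content-Type": ctype}, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["summary"]


# Building the body needs no server; the bytes below are placeholder content.
body, header = build_multipart("file", "contract.pdf", b"%PDF-1.4 placeholder")
print(header)
```

With the Flask app running, `upload_pdf("contract.pdf")` would return the Markdown summary as a string.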

## Under the hood

Below are the processes that take place when you execute the application:

- The Python app writes the uploaded document as `legal.pdf` in the application's directory (next to `app.py`).
- It then executes `legalsimplifier.gpt`, which internally calls `main.py` to split the large document into chunks so that they fit within the token limit of GPT-4's context.
- The analysis will be stored in a `summary.md` document.
- The app will then read this summary file and show the summary on the UI.
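
The control flow that `legalsimplifier.gpt` asks the model to follow can be pictured as a plain loop. This is only an illustration: in the example the model drives the loop through tool calls, and `summarize_document`, `get_chunk`, and the uppercase "summarizer" below are hypothetical stand-ins.

```python
from typing import Callable

SENTINEL = "No more content"  # The marker main.py prints when the index is out of range


def summarize_document(get_chunk: Callable[[int], str],
                       summarize: Callable[[str], str]) -> str:
    """Fetch chunks by index until the sentinel appears, appending each summary."""
    parts = []
    index = 0
    while True:
        chunk = get_chunk(index)
        if chunk.strip() == SENTINEL:
            break
        # Two trailing newlines separate summaries, as the prompt requests.
        parts.append(summarize(chunk) + "\n\n")
        index += 1
    return "".join(parts)


pieces = ["First clause.", "Second clause."]


def get_chunk(i: int) -> str:
    return pieces[i] if i < len(pieces) else SENTINEL


# Uppercasing stands in for the model-generated summary (assumption).
summary = summarize_document(get_chunk, lambda text: text.upper())
print(summary)
```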

Example Summary
```md
### Summary

- **Proposal**: When someone shows their willingness to do or not do something to get agreement from another.
- **Promise**: A proposal that has been agreed upon.
- **Promisor and Promisee**: The one making the proposal and the one accepting it, respectively.
- **Consideration**: Something done or not done, or promised to be done or not done, which forms the reason for a party's agreement to a promise.
- **Agreement**: A set of promises forming the reason for each other's agreement.
- **Contract**: An agreement enforceable by law.
- **Voidable Contract**: A contract that can be enforced or nullified at the option of one or more parties.
- **Void Agreement**: An agreement not enforceable by law.

- The document begins by defining key terms related to contracts, such as proposal, promise, consideration, agreement, contract, voidable contract, and void agreement.
- It outlines the process of making a proposal, accepting it, and the conditions under which a proposal or acceptance can be revoked.
- It emphasizes that for a proposal to turn into a promise, the acceptance must be absolute and unqualified.
- The document also explains that agreements become contracts when they are made with free consent, for a lawful consideration and object, and are not declared void.
- Competence to contract is defined by age, sound mind, and not being disqualified by any law.
- The document details what constitutes free consent, including the absence of coercion, undue influence, fraud, misrepresentation, or mistake.
- It specifies conditions under which agreements are void, such as when both parties are under a mistake regarding a fact essential to the agreement.

```
49 changes: 49 additions & 0 deletions examples/legalsimplifier/app.py
@@ -0,0 +1,49 @@
from flask import Flask, jsonify, render_template, request
import subprocess
import os

app = Flask(__name__)

# Setting the base directory
base_dir = os.path.dirname(os.path.abspath(__file__))
app.config['UPLOAD_FOLDER'] = base_dir

SCRIPT_PATH = os.path.join(base_dir, 'legalsimplifier.gpt')
LEGAL_FILE_NAME = 'legal.pdf' # Uploaded document name
SUMMARY_FILE_NAME = 'summary.md' # The output file name

@app.route('/')
def index():
    return render_template('index.html')

@app.route('/upload', methods=['POST'])
def upload_file():
    if 'file' not in request.files:
        return jsonify({'error': 'No file part'}), 400
    file = request.files['file']
    if file.filename == '':
        return jsonify({'error': 'No selected file'}), 400
    # Save the upload under a fixed name so that main.py can find it
    filename = os.path.join(app.config['UPLOAD_FOLDER'], LEGAL_FILE_NAME)
    file.save(filename)
    try:
        summary = process_file()
    except Exception as e:
        # Report failures as JSON so the front end can display them
        return jsonify({'error': str(e)}), 500
    return jsonify({'summary': summary})

def process_file():
    # Execute the script to generate the summary. Running from base_dir keeps
    # legal.pdf, main.py and summary.md in the same directory.
    subprocess.run(['gptscript', SCRIPT_PATH], cwd=base_dir, check=True)

    # Read summary.md and return its content
    summary_file_path = os.path.join(app.config['UPLOAD_FOLDER'], SUMMARY_FILE_NAME)
    with open(summary_file_path, 'r') as summary_file:
        return summary_file.read()

if __name__ == '__main__':
    app.run(debug=False)
Binary file added examples/legalsimplifier/legal.pdf
Binary file not shown.
38 changes: 38 additions & 0 deletions examples/legalsimplifier/legalsimplifier.gpt
@@ -0,0 +1,38 @@
tools: legal-simplifier, sys.read, sys.write

You are a program that is tasked with analyzing a legal document and creating a summary of it.
Create a new file "summary.md" if it doesn't already exist.
Call the legal-simplifier tool to get each part of the document. Begin with index 0.
Do not proceed until the tool has responded to you.
Once you get "No more content" from the legal-simplifier stop calling it.
Then, print the contents of the summary.md file.

---
name: legal-simplifier
tools: doc-retriever, sys.read, sys.append
description: Summarizes a legal document
args: index: (unsigned int) the index of the portion to summarize, beginning at 0

As a legal expert and practicing lawyer, you are tasked with analyzing the provided legal document.
Your deep understanding of the law equips you to simplify and summarize the document effectively.
Your goal is to make the document more accessible for non-experts, leveraging your expertise to provide clear and concise insights.

Get the part of the legal document at index $index.
Do not leave out any important points. Focus on key points, implications, and any notable clauses or provisions.
Do not leave out any important points focusing on key points, implications, and any notable clauses or provisions.
Member (conversation marked as resolved by techmaharaj): This line should be removed

Give a list of all the terms and explain them in one line before writing the summary in the document.
Give a list of all the terms and explain them in one liner before writing the summary in the document.
Member (conversation marked as resolved by techmaharaj): Same here

For each summary write in smaller chunks or add bullet points if required to make it easy to understand.
Use the heading "Summary" only once in the entire document.
Explain terms in simple language and avoid legal jargon unless absolutely necessary.
Explain terms in simple language and avoid legal terminologies until unless absolutely necessary.
Member (conversation marked as resolved by techmaharaj): And here

Add two newlines to the end of your summary and append it to summary.md.

If you got "No more content" just say "No more content". Otherwise, say "Continue".

---
name: doc-retriever
description: Returns a part of the text of the legal document. Returns "No more content" if the index is greater than the number of parts.
args: index: (unsigned int) the index of the part to return, beginning at 0

#!python3 main.py "$index"
28 changes: 28 additions & 0 deletions examples/legalsimplifier/main.py
@@ -0,0 +1,28 @@
import tiktoken
import sys
from llama_index.readers.file import PyMuPDFReader
from llama_index.core.node_parser import TokenTextSplitter

# Loading the legal document using PyMuPDFReader
index = int(sys.argv[1])
docs = PyMuPDFReader().load("legal.pdf")

# Combining text content from all documents into a single string
combined = ""
for doc in docs:
    combined += doc.text

# Initializing a TokenTextSplitter object with specified parameters
splitter = TokenTextSplitter(
    chunk_size=10000,
    chunk_overlap=10,
    tokenizer=tiktoken.encoding_for_model("gpt-4-turbo-preview").encode)

pieces = splitter.split_text(combined)

# Checking if the specified index is valid
if index >= len(pieces):
    print("No more content")
    sys.exit(0)

print(pieces[index])
4 changes: 4 additions & 0 deletions examples/legalsimplifier/requirements.txt
@@ -0,0 +1,4 @@
Flask==2.0.1
tiktoken==0.6.0
llama-index-core==0.10.14
llama-index-readers-file==0.1.6
54 changes: 54 additions & 0 deletions examples/legalsimplifier/static/css/style.css
@@ -0,0 +1,54 @@
body {
    padding-top: 20px;
    font-family: 'Roboto', sans-serif;
    background-color: #ffffff;
}

.navbar {
    margin-bottom: 20px;
    background-color: #009688; /* Teal color */
}

.navbar-brand, .nav-link {
    color: #fff !important;
}

.container-fluid {
    max-width: 1200px; /* Adjust based on your preference */
}

.row {
    margin: 0;
}

.col-md-6 {
    width: 50%;
    padding: 20px;
    box-sizing: border-box;
}

.form-control, .btn, .custom-file-label {
    border-radius: 0; /* Material design doesn't use rounded corners for inputs/buttons */
}

/* Simplified content styling */
#simplified-content {
    background-color: #fff;
    padding: 20px;
    border-radius: 5px;
    box-shadow: 0 2px 4px rgba(0,0,0,0.1);
}

.loader {
    display: none;
    border: 4px solid #f3f3f3;
    border-top: 4px solid #3498db;
    border-radius: 50%;
    width: 30px;
    height: 30px;
    animation: spin 2s linear infinite;
}
@keyframes spin {
    0% { transform: rotate(0deg); }
    100% { transform: rotate(360deg); }
}
56 changes: 56 additions & 0 deletions examples/legalsimplifier/static/js/script.js
@@ -0,0 +1,56 @@
document.addEventListener('DOMContentLoaded', function() {

    const randomMessages = [
        "Brewing up some simplicity.",
        "Decoding legalese.",
        "Simplifying complex texts.",
        "Turning the complicated into the understandable.",
        "Working our magic on your document."
    ];

    // Define uploadFile globally
    window.uploadFile = function() {
        var form = document.getElementById('uploadForm');
        var formData = new FormData(form);
        var summaryBlock = document.getElementById('summaryBlock');
        var summaryOutput = document.getElementById('documentSummary');

        // Display a random message
        var messageDiv = document.getElementById('randomMessage');
        messageDiv.style.display = 'block'; // Show the message area again on repeat uploads
        messageDiv.innerHTML = randomMessages[Math.floor(Math.random() * randomMessages.length)]; // Display initial random message
        var messageInterval = setInterval(function() {
            messageDiv.innerHTML = randomMessages[Math.floor(Math.random() * randomMessages.length)];
        }, 5000); // Change message every 5 seconds

        // Stop rotating the messages and hide the message area
        function clearMessage() {
            clearInterval(messageInterval);
            messageDiv.style.display = 'none';
        }

        fetch('/upload', {
            method: 'POST',
            body: formData,
        })
        .then(response => response.json()) // Parse the JSON response
        .then(data => {
            if (data.summary) {
                var converter = new showdown.Converter();
                var parsedHtml = converter.makeHtml(data.summary);
                summaryOutput.innerHTML = parsedHtml; // Display the summary
                summaryBlock.style.display = 'block';
                clearMessage();

                // Scroll to the documentSummary div
                summaryOutput.scrollIntoView({
                    behavior: 'smooth', // Smooth scroll
                    block: 'start' // Align to the top of the view
                });
            } else if (data.error) {
                summaryOutput.innerHTML = `<p>Error: ${data.error}</p>`;
                summaryBlock.style.display = 'block'; // The block is hidden by default, so reveal the error
                clearMessage();
            }
        })
        .catch(error => {
            console.error('Error:', error);
            summaryOutput.innerHTML = `<p>Error: ${error}</p>`;
            summaryBlock.style.display = 'block'; // The block is hidden by default, so reveal the error
            clearMessage();
        });
    };
});
65 changes: 65 additions & 0 deletions examples/legalsimplifier/templates/index.html
@@ -0,0 +1,65 @@
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Legal Simplifier</title>
<link href="https://cdn.jsdelivr.net/npm/[email protected]/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-T3c6CoIi6uLrA9TneNEoa7RxnatzjcDSCmG1MXxSR1GAsXEV/Dwwykc2MPK8M2HN" crossorigin="anonymous">
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/showdown.min.js"></script>
<link href="{{ url_for('static', filename='css/style.css') }}" rel="stylesheet">
</head>
<body>
<nav class="navbar navbar-expand-lg navbar-dark bg-dark">
<div class="container">
<a class="navbar-brand" href="#">Legal Simplifier</a>
<button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbarSupportedContent" aria-controls="navbarSupportedContent" aria-expanded="false" aria-label="Toggle navigation">
<span class="navbar-toggler-icon"></span>
</button>
<div class="collapse navbar-collapse" id="navbarSupportedContent">
<ul class="navbar-nav me-auto mb-2 mb-lg-0">
<li class="nav-item">
<a class="nav-link active" aria-current="page" href="#">Home</a>
</li>
<li class="nav-item">
<a class="nav-link" href="https://gptscript.ai">GPTScript</a>
</li>
</ul>
</div>
</div>
</nav>

<div class="container col-xl-10 col-xxl-8 px-4 py-5">
<div class="row align-items-center g-lg-5 py-5">
<div class="col-lg-4 text-center text-lg-start">
<h3 class="display-6 fw-bold lh-3 mb-4">Legal Document Simplifier</h3>
<p class="fs-5">Upload your legal documents in PDF format, and let our tool simplify the content into easy-to-understand text. This simplification aims to make legal jargon accessible to everyone.</p>
</div>
<div class="col-lg-8 mx-auto">
<form id="uploadForm" class="p-4 p-md-5 border rounded-3 bg-light" enctype="multipart/form-data">
<input type="file" name="file" class="form-control" id="formFile" aria-describedby="inputGroupFileAddon04" aria-label="Upload">
<button class="w-100 btn btn-lg btn-primary" style="margin-top: 15px;" type="button" id="inputGroupFileAddon04" onclick="uploadFile()">Simplify It</button>
<div id="randomMessage" style="margin-top: 10px;" class="mt-3"></div>
</form>
</div>
</div>
</div>

<hr class="my-4">
<div class="container col-xl-10 col-xxl-8 px-4 py-5" id="summaryBlock" style="display: none;">
<div class="row">
<div class="col-12">
<h2 class="display-6" style="text-align: center;">Summary</h2>
<div id="documentSummary" class="border rounded-3 p-4 bg-light">
<!-- The summarized document will be displayed here -->
</div>
</div>
</div>
</div>


<script src="https://code.jquery.com/jquery-3.5.1.slim.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/@popperjs/[email protected]/dist/umd/popper.min.js"></script>
<script src="https://stackpath.bootstrapcdn.com/bootstrap/5.0.0-alpha1/js/bootstrap.min.js"></script>
<script src="{{ url_for('static', filename='js/script.js') }}"></script>
</body>
</html>