Added example for Legal Simplifier #141
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,88 @@ | ||
# Legal Simplifier | ||
|
||
Legal Simplifier is an application that simplifies legal documents, from the terms and conditions of an insurance policy to a business contract. This example also shows how to summarize the contents of a large document in chunks. | ||
|
||
## Design | ||
|
||
The script consists of three tools: a top-level tool that orchestrates everything, a summarizer that | ||
will summarize one chunk of text at a time, and a Python script that ingests the PDF and splits it into | ||
chunks and provides a specific chunk based on an index. | ||
|
||
The summarizer tool looks at the entire summary up to the current chunk and then summarizes the current | ||
chunk and adds it onto the end. In the case of models with very small context windows, or extremely large | ||
documents, this approach may still exceed the context window, in which case another tool could be added to | ||
only give the summarizer the previous few chunk summaries instead of all of them. | ||
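
The fallback described above, giving the summarizer only the previous few chunk summaries, could look roughly like the following. This is a minimal sketch and not part of the example: `recent_context` and `max_previous` are hypothetical names, and it assumes the per-chunk summaries are held in a Python list in document order.

```python
from typing import List


def recent_context(summaries: List[str], max_previous: int = 3) -> str:
    """Join only the last `max_previous` chunk summaries.

    Hypothetical helper: the example itself hands the whole of summary.md
    to the summarizer; this shows the bounded alternative.
    """
    if max_previous <= 0:
        return ""
    # A negative slice start keeps at most the last N entries.
    return "\n\n".join(summaries[-max_previous:])


# Only the two most recent summaries are passed on as context.
history = ["summary of chunk 0", "summary of chunk 1", "summary of chunk 2"]
print(recent_context(history, max_previous=2))
```

The trade-off is that terms defined early in the document may no longer be visible to the summarizer once they fall out of the window.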
|
||
Because uploaded documents vary in size, the script splits them into chunks of up to 10,000 tokens so that each chunk fits within GPT-4's context window. | ||
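
To make the chunking concrete, here is a simplified stand-in for what a token-based splitter does. It is an illustration only: the example itself uses `TokenTextSplitter` from LlamaIndex, whose real algorithm also tries to respect word and sentence boundaries, and `split_tokens` is a hypothetical name. Tokens are represented as a plain list of integers so the sketch has no dependencies.

```python
from typing import List, Sequence


def split_tokens(tokens: Sequence[int], chunk_size: int, overlap: int) -> List[List[int]]:
    """Split a token sequence into windows of at most `chunk_size` tokens.

    Consecutive windows share `overlap` tokens. Simplified illustration,
    not the algorithm used by LlamaIndex's TokenTextSplitter.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks


# 25 fake token ids, windows of 10 tokens with 2 shared tokens.
for chunk in split_tokens(list(range(25)), chunk_size=10, overlap=2):
    print(chunk[0], "to", chunk[-1])
```

With the example's settings (`chunk_size=10000`, `chunk_overlap=10`) the overlap is tiny relative to the window, so it mainly avoids cutting a sentence cleanly in half at a boundary.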
|
||
## Installation | ||
|
||
### Prerequisites | ||
|
||
- Python 3.8 or later | ||
- Flask | ||
- Other Python dependencies listed in `requirements.txt`. | ||
|
||
### Steps | ||
|
||
1. Clone the repository: | ||
|
||
``` bash | ||
git clone https://github.com/gptscript-ai/gptscript.git | ||
``` | ||
|
||
2. Navigate to the `examples/legalsimplifier` directory and install the dependencies: | ||
|
||
Python: | ||
|
||
```bash | ||
pip install -r requirements.txt | ||
``` | ||
|
||
Node: | ||
|
||
```bash | ||
npm install | ||
``` | ||
|
||
3. Set up `OPENAI_API_KEY` (e.g., `export OPENAI_API_KEY="yourapikey123456"`). You can get your [API key here](https://platform.openai.com/api-keys). | ||
|
||
4. Run the Flask application using `flask run` or `python app.py` | ||
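
Before starting the server, it can help to confirm that the pieces the steps above depend on are in place. The check below is a sketch under two assumptions: that the `gptscript` command (which `app.py` invokes) must be on `PATH`, and that `OPENAI_API_KEY` must be non-empty. `missing_prerequisites` is a hypothetical helper, not part of the example.

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional


def missing_prerequisites(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return human-readable problems; an empty list means ready to run."""
    problems = []
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    if which("gptscript") is None:
        problems.append("gptscript was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("Missing:", problem)
```

The `env` and `which` parameters exist only so the check can be exercised without touching the real environment.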
|
||
## Usage | ||
|
||
1. Open your web browser and navigate to `http://127.0.0.1:5000/`. | ||
2. Use the web interface to upload a legal document in .pdf format. | ||
3. The application will analyze the document and show a summary. | ||
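
The same upload can be scripted instead of going through the browser. The sketch below uses only the standard library and assumes the server is running at `http://127.0.0.1:5000/` and accepts the PDF in a form field named `file` on `/upload`, as `app.py` does. `build_multipart` and `upload_pdf` are hypothetical helper names.

```python
import json
import urllib.request
import uuid
from typing import Tuple


def build_multipart(field: str, filename: str, data: bytes) -> Tuple[bytes, str]:
    """Encode one file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload_pdf(path: str, url: str = "http://127.0.0.1:5000/upload") -> dict:
    """POST a PDF to the running app and return the decoded JSON reply."""
    with open(path, "rb") as fh:
        body, content_type = build_multipart("file", "document.pdf", fh.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `upload_pdf("contract.pdf")` (the file name is a placeholder) should return a dictionary with a `summary` key on success. Expect the call to block for a while, since the server only responds once every chunk has been summarized.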
|
||
## Under the hood | ||
|
||
Below are the processes that take place when you execute the application: | ||
|
||
- The Python app writes the uploaded document as `legal.pdf` in the current working directory. | ||
- It then executes `legalsimplifier.gpt`, which internally calls `main.py` to split the large document into chunks so that they fit within the token limit of GPT-4's context window. | ||
- The analysis will be stored in a `summary.md` document. | ||
- The app will then read this summary file and show the summary on the UI. | ||
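
Because each chunk summary is appended to `summary.md`, a file left over from an earlier upload would likely be carried into the next result. The example as submitted does not address this. One possible guard, sketched here with a hypothetical `reset_summary` helper, is to delete the old file before the script runs.

```python
import os


def reset_summary(path: str = "summary.md") -> bool:
    """Delete a leftover summary file; return True if one was removed.

    Hypothetical helper, not part of the example application.
    """
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        # Nothing to clean up, which is the normal case on a first run.
        return False
```

If adopted, it would be called just before the script is executed, with the same path the app later reads the summary from.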
|
||
Example Summary | ||
```md | ||
### Summary | ||
|
||
- **Proposal**: When someone shows their willingness to do or not do something to get agreement from another. | ||
- **Promise**: A proposal that has been agreed upon. | ||
- **Promisor and Promisee**: The one making the proposal and the one accepting it, respectively. | ||
- **Consideration**: Something done or not done, or promised to be done or not done, which forms the reason for a party's agreement to a promise. | ||
- **Agreement**: A set of promises forming the reason for each other's agreement. | ||
- **Contract**: An agreement enforceable by law. | ||
- **Voidable Contract**: A contract that can be enforced or nullified at the option of one or more parties. | ||
- **Void Agreement**: An agreement not enforceable by law. | ||
|
||
- The document begins by defining key terms related to contracts, such as proposal, promise, consideration, agreement, contract, voidable contract, and void agreement. | ||
- It outlines the process of making a proposal, accepting it, and the conditions under which a proposal or acceptance can be revoked. | ||
- It emphasizes that for a proposal to turn into a promise, the acceptance must be absolute and unqualified. | ||
- The document also explains that agreements become contracts when they are made with free consent, for a lawful consideration and object, and are not declared void. | ||
- Competence to contract is defined by age, sound mind, and not being disqualified by any law. | ||
- The document details what constitutes free consent, including the absence of coercion, undue influence, fraud, misrepresentation, or mistake. | ||
- It specifies conditions under which agreements are void, such as when both parties are under a mistake regarding a fact essential to the agreement. | ||
|
||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
from flask import Flask, jsonify, render_template, request
import subprocess
import os

app = Flask(__name__)

# Base directory: the upload and the generated summary live next to this file
base_dir = os.path.dirname(os.path.abspath(__file__))
app.config['UPLOAD_FOLDER'] = base_dir

SCRIPT_PATH = os.path.join(base_dir, 'legalsimplifier.gpt')
LEGAL_FILE_NAME = 'legal.pdf'  # Uploaded document name
SUMMARY_FILE_NAME = 'summary.md'  # The output file name

@app.route('/')
def index():
    return render_template('index.html')

@app.route('/upload', methods=['POST'])
def upload_file():
    if 'file' not in request.files:
        return jsonify({'error': 'No file part'}), 400
    file = request.files['file']
    if file.filename == '':
        return jsonify({'error': 'No selected file'}), 400
    # Save the upload under a fixed name so main.py can find it
    filename = os.path.join(app.config['UPLOAD_FOLDER'], LEGAL_FILE_NAME)
    file.save(filename)
    try:
        summary = process_file()
    except Exception as e:
        # Report failures as JSON so the front end can display them
        return jsonify({'error': str(e)}), 500
    return jsonify({'summary': summary})

def process_file():
    # Execute the script to generate the summary; an argument list avoids
    # shell quoting problems if the path contains spaces
    subprocess.run(["gptscript", SCRIPT_PATH], check=True)

    # Read the summary.md file and return its content
    summary_file_path = os.path.join(app.config['UPLOAD_FOLDER'], SUMMARY_FILE_NAME)
    with open(summary_file_path, 'r') as summary_file:
        return summary_file.read()

if __name__ == '__main__':
    app.run(debug=False)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
tools: legal-simplifier, sys.read, sys.write | ||
|
||
You are a program that is tasked with analyzing a legal document and creating a summary of it. | ||
Create a new file "summary.md" if it doesn't already exist. | ||
Call the legal-simplifier tool to get each part of the document. Begin with index 0. | ||
Do not proceed until the tool has responded to you. | ||
Once you get "No more content" from the legal-simplifier stop calling it. | ||
Then, print the contents of the summary.md file. | ||
|
||
--- | ||
name: legal-simplifier | ||
tools: doc-retriever, sys.read, sys.append | ||
description: Summarizes a legal document | ||
|
||
args: index: (unsigned int) the index of the portion to summarize, beginning at 0 | ||
|
||
As a legal expert and practicing lawyer, you are tasked with analyzing the provided legal document. | ||
Your deep understanding of the law equips you to simplify and summarize the document effectively. | ||
Your goal is to make the document more accessible for non-experts, leveraging your expertise to provide clear and concise insights. | ||
|
||
Get the part of the legal document at index $index. | ||
Do not leave out any important points. Focus on key points, implications, and any notable clauses or provisions. | ||
|
||
Give a list of all the terms and explain them in one line before writing the summary in the document. | ||
|
||
For each summary write in smaller chunks or add bullet points if required to make it easy to understand. | ||
Use the heading "Summary" only once in the entire document. | ||
Explain terms in simple language and avoid legal jargon unless absolutely necessary. | ||
|
||
Add two newlines to the end of your summary and append it to summary.md. | ||
|
||
If you got "No more content" just say "No more content". Otherwise, say "Continue". | ||
|
||
--- | ||
name: doc-retriever | ||
description: Returns a part of the text of the legal document. Returns "No more content" if the index is past the last part. | ||
args: index: (unsigned int) the index of the part to return, beginning at 0 | ||
|
||
#!python3 main.py "$index" |
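
Read as a conventional program, the three tools above form a loop: request part 0, then 1, then 2, and so on, until the retriever answers "No more content". The sketch below mirrors that control flow in plain Python. It is only an analogy: in GPTScript the model drives the loop, and `get_chunk` and `summarize` are hypothetical callables standing in for doc-retriever and the LLM call.

```python
from typing import Callable, List

SENTINEL = "No more content"


def summarize_document(
    get_chunk: Callable[[int], str],
    summarize: Callable[[str, str], str],
) -> str:
    """Summarize chunks in order, appending each result to a running summary."""
    parts: List[str] = []
    index = 0
    while True:
        chunk = get_chunk(index)
        if chunk == SENTINEL:
            break
        # The summarizer sees everything written so far plus the new chunk.
        so_far = "\n\n".join(parts)
        parts.append(summarize(so_far, chunk))
        index += 1
    return "\n\n".join(parts)


# Toy stand-ins: three "chunks" and a summarizer that just upper-cases.
chunks = ["alpha", "beta", "gamma"]
fake_get = lambda i: chunks[i] if i < len(chunks) else SENTINEL
print(summarize_document(fake_get, lambda so_far, chunk: chunk.upper()))
```

The two-newline separator matches the prompt's instruction to add two newlines after each appended summary.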
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
import tiktoken
import sys
from llama_index.readers.file import PyMuPDFReader
from llama_index.core.node_parser import TokenTextSplitter

# Index of the chunk to return, passed as the first command-line argument
index = int(sys.argv[1])

# Loading the legal document using PyMuPDFReader
docs = PyMuPDFReader().load("legal.pdf")

# Combining text content from all documents into a single string
combined = ""
for doc in docs:
    combined += doc.text

# Initializing a TokenTextSplitter: chunks of up to 10,000 tokens with a
# 10-token overlap, counted with the GPT-4 tokenizer
splitter = TokenTextSplitter(
    chunk_size=10000,
    chunk_overlap=10,
    tokenizer=tiktoken.encoding_for_model("gpt-4-turbo-preview").encode)

pieces = splitter.split_text(combined)

# Checking if the specified index is valid; past the last chunk, print the
# sentinel that tells the caller to stop
if index >= len(pieces):
    print("No more content")
    sys.exit(0)

print(pieces[index])
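
The index check at the end of the script is what the rest of the pipeline relies on to stop. Pulled out into a function, which the script itself does not do (`get_piece` is a hypothetical name), the contract is easy to exercise without a PDF. Rejecting negative indices is an addition of this sketch; the script would treat `-1` as "last chunk" through normal Python indexing.

```python
from typing import List

NO_MORE_CONTENT = "No more content"


def get_piece(pieces: List[str], index: int) -> str:
    """Return the chunk at `index`, or the sentinel once past the last chunk."""
    if index < 0:
        # Assumption of this sketch: the tool's index is documented as unsigned.
        raise ValueError("index must be non-negative")
    if index >= len(pieces):
        return NO_MORE_CONTENT
    return pieces[index]
```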
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
Flask==2.0.1 | ||
tiktoken==0.6.0 | ||
llama-index-core==0.10.14 | ||
llama-index-readers-file==0.1.6 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
body { | ||
padding-top: 20px; | ||
font-family: 'Roboto', sans-serif; | ||
background-color: #ffffff; | ||
} | ||
|
||
.navbar { | ||
margin-bottom: 20px; | ||
background-color: #009688; /* Teal color */ | ||
} | ||
|
||
.navbar-brand, .nav-link { | ||
color: #fff !important; | ||
} | ||
|
||
.container-fluid { | ||
max-width: 1200px; /* Adjust based on your preference */ | ||
} | ||
|
||
.row { | ||
margin: 0; | ||
} | ||
|
||
.col-md-6 { | ||
width: 50%; | ||
padding: 20px; | ||
box-sizing: border-box; | ||
} | ||
|
||
.form-control, .btn, .custom-file-label { | ||
border-radius: 0; /* Material design doesn't use rounded corners for inputs/buttons */ | ||
} | ||
|
||
/* Simplified content styling */ | ||
#simplified-content { | ||
background-color: #fff; | ||
padding: 20px; | ||
border-radius: 5px; | ||
box-shadow: 0 2px 4px rgba(0,0,0,0.1); | ||
} | ||
|
||
.loader { | ||
display: none; | ||
border: 4px solid #f3f3f3; | ||
border-top: 4px solid #3498db; | ||
border-radius: 50%; | ||
width: 30px; | ||
height: 30px; | ||
animation: spin 2s linear infinite; | ||
} | ||
@keyframes spin { | ||
0% { transform: rotate(0deg); } | ||
100% { transform: rotate(360deg); } | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
document.addEventListener('DOMContentLoaded', function() { | ||
|
||
const randomMessages = [ | ||
"Brewing up some simplicity.", | ||
"Decoding legalese.", | ||
"Simplifying complex texts.", | ||
"Turning the complicated into the understandable.", | ||
"Working our magic on your document." | ||
]; | ||
|
||
// Define uploadFile globally | ||
window.uploadFile = function() { | ||
var form = document.getElementById('uploadForm'); | ||
var formData = new FormData(form); | ||
var summaryBlock = document.getElementById('summaryBlock'); | ||
var summaryOutput = document.getElementById('documentSummary'); | ||
|
||
// Display a random message | ||
var messageDiv = document.getElementById('randomMessage'); | ||
messageDiv.innerHTML = randomMessages[Math.floor(Math.random() * randomMessages.length)]; // Display initial random message | ||
var messageInterval = setInterval(function() { | ||
messageDiv.innerHTML = randomMessages[Math.floor(Math.random() * randomMessages.length)]; | ||
}, 5000); // Change message every 5 seconds | ||
|
||
fetch('/upload', { | ||
method: 'POST', | ||
body: formData, | ||
}) | ||
.then(response => response.json()) // Parse the JSON response | ||
.then(data => { | ||
if(data.summary) { | ||
console.log(data.summary) | ||
var converter = new showdown.Converter() | ||
var parsedHtml = converter.makeHtml(data.summary); | ||
summaryOutput.innerHTML = parsedHtml; // Display the summary | ||
summaryBlock.style.display='block' | ||
messageDiv.style.display = 'none' // Clear message | ||
|
||
// Scroll to the documentSummary div | ||
document.getElementById('documentSummary').scrollIntoView({ | ||
behavior: 'smooth', // Smooth scroll | ||
block: 'start' // Align to the top of the view | ||
}); | ||
|
||
} else if (data.error) { | ||
summaryOutput.innerHTML = `<p>Error: ${data.error}</p>`; | ||
summaryBlock.style.display = 'block' // Show the block so the error is visible | ||
messageDiv.style.display = 'none' // Clear message | ||
} | ||
}) | ||
.catch(error => { | ||
console.error('Error:', error); | ||
summaryOutput.innerHTML = `<p>Error: ${error}</p>`; | ||
summaryBlock.style.display = 'block' // Show the block so the error is visible | ||
messageDiv.style.display = 'none' // Clear message | ||
}); | ||
}; | ||
}); |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,65 @@ | ||
<!doctype html> | ||
<html lang="en"> | ||
<head> | ||
<meta charset="utf-8"> | ||
<meta name="viewport" content="width=device-width, initial-scale=1"> | ||
<title>Legal Simplifier</title> | ||
<link href="https://cdn.jsdelivr.net/npm/[email protected]/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-T3c6CoIi6uLrA9TneNEoa7RxnatzjcDSCmG1MXxSR1GAsXEV/Dwwykc2MPK8M2HN" crossorigin="anonymous"> | ||
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/showdown.min.js"></script> | ||
<link href="{{ url_for('static', filename='css/style.css') }}" rel="stylesheet"> | ||
</head> | ||
<body> | ||
<nav class="navbar navbar-expand-lg navbar-dark bg-dark"> | ||
<div class="container"> | ||
<a class="navbar-brand" href="#">Legal Simplifier</a> | ||
<button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbarSupportedContent" aria-controls="navbarSupportedContent" aria-expanded="false" aria-label="Toggle navigation"> | ||
<span class="navbar-toggler-icon"></span> | ||
</button> | ||
<div class="collapse navbar-collapse" id="navbarSupportedContent"> | ||
<ul class="navbar-nav me-auto mb-2 mb-lg-0"> | ||
<li class="nav-item"> | ||
<a class="nav-link active" aria-current="page" href="#">Home</a> | ||
</li> | ||
<li class="nav-item"> | ||
<a class="nav-link" href="https://gptscript.ai">GPTScript</a> | ||
</li> | ||
</ul> | ||
</div> | ||
</div> | ||
</nav> | ||
|
||
<div class="container col-xl-10 col-xxl-8 px-4 py-5"> | ||
<div class="row align-items-center g-lg-5 py-5"> | ||
<div class="col-lg-4 text-center text-lg-start"> | ||
<h3 class="display-6 fw-bold lh-3 mb-4">Legal Document Simplifier</h3> | ||
<p class="fs-5">Upload your legal documents in PDF format, and let our tool simplify the content into easy-to-understand text. This simplification aims to make legal jargon accessible to everyone.</p> | ||
</div> | ||
<div class="col-lg-8 mx-auto"> | ||
<form id="uploadForm" class="p-4 p-md-5 border rounded-3 bg-light" enctype="multipart/form-data"> | ||
<input type="file" name="file" class="form-control" id="formFile" aria-describedby="inputGroupFileAddon04" aria-label="Upload"> | ||
<button class="w-100 btn btn-lg btn-primary" style="margin-top: 15px;" type="button" id="inputGroupFileAddon04" onclick="uploadFile()">Simplify It</button> | ||
<div id="randomMessage" style="margin-top: 10px;" class="mt-3"></div> | ||
</form> | ||
</div> | ||
</div> | ||
</div> | ||
|
||
<hr class="my-4"> | ||
<div class="container col-xl-10 col-xxl-8 px-4 py-5" id="summaryBlock" style="display: none;"> | ||
<div class="row"> | ||
<div class="col-12"> | ||
<h2 class="display-6" style="text-align: center;">Summary</h2> | ||
<div id="documentSummary" class="border rounded-3 p-4 bg-light"> | ||
<!-- The summarized document will be displayed here --> | ||
</div> | ||
</div> | ||
</div> | ||
</div> | ||
|
||
|
||
<script src="https://code.jquery.com/jquery-3.5.1.slim.min.js"></script> | ||
<script src="https://cdn.jsdelivr.net/npm/@popperjs/[email protected]/dist/umd/popper.min.js"></script> | ||
<script src="https://stackpath.bootstrapcdn.com/bootstrap/5.0.0-alpha1/js/bootstrap.min.js"></script> | ||
<script src="{{ url_for('static', filename='js/script.js') }}"></script> | ||
</body> | ||
</html> |