Added example: meeting conversations extractor #1009

Ashish-Abraham · 2024-11-07T18:31:14Z

Context

Valuable business insights are often hidden in daily conversations across organizations, from customer interactions to internal meetings. I had the idea of building something with Indexify that helps extract and use this data effectively.

This PR adds an example use case: a conversation extractor that uses a custom Indexify extraction graph to extract summarized data in a structured format from long meeting audio files.

What

The extractor workflow is as follows:

Audio Processing:
- Transcription: Converts speech to text using Faster Whisper.
- Meeting Classification Router: Uses an LLM to determine the type of meeting and routes control to the corresponding node of the compute graph.

Content Analysis:
Based on the meeting type classification, the system generates structured summaries:
- Strategy Meetings: Key decisions, action items, and strategic initiatives
- Sales/Marketing/Product Calls: Customer details, pain points, and next steps
- R&D Brainstorms: Innovative ideas, technical challenges, resource requirements, and potential impacts

You can tweak the fields to extract whatever data you need.
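
The classify-then-route step described above can be sketched in plain Python. This is a minimal illustration, not the code in this PR: the dataclass names, the `rd-brainstorm` label, and the keyword-based `classify_meeting` stand-in for the LLM call are all assumptions made for the example. Only the `sales-call` and `strategy-meeting` labels and the field groups come from the description and sample logs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Type


@dataclass
class StrategyMeetingSummary:
    key_decisions: List[str] = field(default_factory=list)
    action_items: List[str] = field(default_factory=list)
    strategic_initiatives: List[str] = field(default_factory=list)


@dataclass
class SalesCallSummary:
    customer_pain_points: List[str] = field(default_factory=list)
    proposed_solutions: List[str] = field(default_factory=list)
    next_steps: List[str] = field(default_factory=list)


@dataclass
class RnDBrainstormSummary:
    innovative_ideas: List[str] = field(default_factory=list)
    technical_challenges: List[str] = field(default_factory=list)
    resource_requirements: List[str] = field(default_factory=list)
    potential_impacts: List[str] = field(default_factory=list)


# Hypothetical mapping from classifier label to the schema to fill in.
SUMMARY_SCHEMAS: Dict[str, Type] = {
    "strategy-meeting": StrategyMeetingSummary,
    "sales-call": SalesCallSummary,
    "rd-brainstorm": RnDBrainstormSummary,  # label name is an assumption
}


def classify_meeting(transcript: str) -> str:
    """Stand-in for the LLM classifier: a crude keyword match, for illustration only."""
    text = transcript.lower()
    if "customer" in text or "pricing" in text:
        return "sales-call"
    if "prototype" in text or "experiment" in text:
        return "rd-brainstorm"
    return "strategy-meeting"


def route(transcript: str) -> Tuple[str, object]:
    """Pick the summary schema for a transcript and return an empty instance of it."""
    label = classify_meeting(transcript)
    return label, SUMMARY_SCHEMAS[label]()


label, summary = route("The customer asked about pricing for the fiber plan.")
print(label, summary)
```

Adding or renaming a field on one of the dataclasses is the kind of tweak the description refers to; the router itself would not need to change.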

Sample Outputs

2024-11-10 22:41:27,465 - INFO - Transcription Classification: sales-call
2024-11-10 22:41:27,471 - INFO - 
Extracted information:
Meeting ID: RD-20241110-224127
Date: 2024-11-10 22:41:27
Duration: 538 seconds
Participants: None
Meeting Type: Sales Call
Customer Pain Points: []
Proposed Solutions: ['Having fiber connection, faster internet speeds for 4K streaming (up to 800 megabits), 100 megabits for only $25 with free installation fee']
Objections: []
Next Steps: ['(Empty response for 1 and 3. Added some data in the 2 and 4 category based on provided transcript response)', "Candice will call back at 7 in the evening and after talking with Vanessa's husband to answer few questions. Candice will send a link to sign up for the 100 megabits plan which Vanessa will fill out to complete the purchase"]

2024-11-10 22:51:40,001 - INFO - Transcription Classification: strategy-meeting
2024-11-10 22:51:40,006 - INFO - 
Extracted information:
Meeting ID: Strategy-20241110-225139
Date: 2024-11-10 22:51:39
Duration: 97 seconds
Participants: None
Meeting Type: Strategy Meeting
Key Decisions: ['Host a pancake breakfast next week to encourage students to come to school on Fridays', "Put up posters with tips on not getting sick since it's almost flu season", 'Refer John Smith to the guidance counselor for support', "Look for free or low-cost community resources to help John Smith's family."]
Risk Assessments: ['Chronically absent students', 'Students getting sick due to the cold weather.']
Strategic Initiatives: ['Improving student attendance', 'Promoting health and hygiene practices among students.']
Action Items: ['Plan and host a pancake breakfast next week', 'Create and put up posters with tips on not getting sick', 'Refer John Smith to the guidance counselor', "Research and share free or low-cost community resources with John Smith's family."]

Testing

Local Installation - In Process

Clone this repository:

git clone https://github.com/tensorlakeai/indexify
cd indexify/examples/conversation_extraction

Create a virtual environment and activate it:

python -m venv venv
source venv/bin/activate

Install the required dependencies:
```
pip install -r requirements.txt
```
Run the main script:
```
python main.py --mode in-process-run
```

Contribution Checklist

If the python-sdk was changed, please run make fmt in python-sdk/.
If the server was changed, please run make fmt in server/.
Make sure all PR Checks are passing.

diptanu · 2024-11-08T17:12:55Z

@Ashish-Abraham How is this going? Are you blocked on anything?

Ashish-Abraham · 2024-11-09T17:56:12Z

No issues, @diptanu. I was a little busy. Will complete it soon. Thanks!

diptanu · 2024-11-09T20:13:23Z

@Ashish-Abraham Did you see our example here - https://github.com/tensorlakeai/indexify/tree/main/examples/video_summarization -- I am wondering what the difference is between this demo and what's there?

Ashish-Abraham · 2024-11-10T17:39:31Z

Sorry, I added the wrong file. Here we extract the summary in a structured format defined by the schema of each meeting type. This data structure can be passed to the frontend or processed further as required. Please check.

Should I convert it to JSON or something?

diptanu · 2024-11-10T17:44:54Z

@Ashish-Abraham Yeah, if you use JSON it might be easier for people to consume the workflow using HTTP APIs directly. Add encoder='json' in your decorators and function classes. Also, please add a link to the video you used to test this so that people get the best result first when they try out the example :)

After that, you could do something like this to invoke the workflow:

curl -X POST -H "Content-Type: application/json" http://localhost:8900/namespaces/default/compute_graphs \
  -d '{....}'

and

curl -X GET http://localhost:8900/namespaces/default/compute_graphs/<cg>/invocations/<invocation>/fn/<fn_name>

I don't quite remember the APIs correctly; they are in the code and on our website.
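
To illustrate why JSON output is convenient for HTTP consumers, here is a minimal sketch of serializing one extracted record with the standard library. The `SalesCallRecord` class and its field names are hypothetical; the values are copied from the sales-call sample output in the PR description. It does not use Indexify's own JSON encoder, whose behavior is not shown in this thread.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class SalesCallRecord:
    # Field names are assumptions chosen to mirror the sample log output.
    meeting_id: str
    date: str
    duration_seconds: int
    participants: Optional[List[str]]
    meeting_type: str
    customer_pain_points: List[str] = field(default_factory=list)
    proposed_solutions: List[str] = field(default_factory=list)
    objections: List[str] = field(default_factory=list)
    next_steps: List[str] = field(default_factory=list)


record = SalesCallRecord(
    meeting_id="RD-20241110-224127",
    date="2024-11-10 22:41:27",
    duration_seconds=538,
    participants=None,
    meeting_type="Sales Call",
)

# asdict() turns the dataclass into plain dicts and lists, which json can encode.
payload = json.dumps(asdict(record), indent=2)
print(payload)

# Any HTTP client can decode the same payload back into plain Python data.
restored = json.loads(payload)
```

A client reading the function output over HTTP would then work with ordinary JSON keys instead of a pickled Python object.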

Ashish-Abraham · 2024-11-19T17:54:23Z

> @Ashish-Abraham Yeah, if you use JSON it might be easier for people to consume the workflow using HTTP APIs directly. Add encoder='json' in your decorators and function classes. Also, please add a link to the video you used to test this so that people get the best result first when they try out the example :)
>
> After that, you could do something like this to invoke the workflow:
> curl -X POST -H "Content-Type: application/json" http://localhost:8900/namespaces/default/compute_graphs \
>   -d '{....}'
> and
> curl -X GET http://localhost:8900/namespaces/default/compute_graphs/<cg>/invocations/<invocation>/fn/<fn_name>
> I don't quite remember the APIs correctly; they are in the code and on our website.

I can't find the page you are referring to. Is this the page? https://docs.tensorlake.ai/api-reference/documents/extract/extract-file-sync. Could you please guide me a bit on how to do this?

Ashish-Abraham added 10 commits November 5, 2024 11:14

added improved prompts for better consistent results

570371e

Revert "added improved prompts for better consistent results"

2db61f2

This reverts commit 570371e.

added improved prompts that gave better consistent results in workflow

00fec5a

added improved prompts that gave better consistent results in workflow

7c63e38

Merge branch 'main' of https://github.com/Ashish-Abraham/indexify int…

3dd42a3

…o meeting_conversations_extractor

added workflow

96729b3

added docker-compose.yml

3bc15b9

added requirements

e91410b

added readme

48ff9da

typos and corrections

24a2670

Ashish-Abraham changed the title ~~Added example meeting conversations extractor~~ Added example: meeting conversations extractor Nov 7, 2024

Ashish-Abraham marked this pull request as ready for review November 9, 2024 18:09

fixes and typos

cdaa162
