TWE: Research: Transcription Software Comparison #629

Closed
10 tasks done
pandanista opened this issue Oct 10, 2024 · 16 comments

Comments

@pandanista
Member

pandanista commented Oct 10, 2024

Overview

We need to compile a list of preferred transcription software services and compare their costs, allowed transcript quantity, transcription quality, etc.

Action Items

  • Use TWE: Template: Spreadsheet (linked in Resources 1.01) to create a new spreadsheet
  • Rename the spreadsheet as TWE: Research: Transcription Software Comparison
  • Save the spreadsheet in the Guides and How-Tos folder
    • The path is Internship > Internships > Research > Guides and How-Tos
  • In Resources 2.01, paste the spreadsheet URL between the parentheses at the end of the line, with no space between the right bracket ] and the left parenthesis (, so that it becomes a hyperlink
  • Use the newly created spreadsheet to document transcription software/services and their pros and cons
    • Research transcription software and list at least 5 tools commonly used by researchers
    • Use one row per software product or pricing tier. For example, Otter.ai has Basic, Pro, Business, and Enterprise tiers
    • Use one column per factor in the spreadsheet
      • Factors to consider include, but are not limited to: cost per month; how many transcripts or minutes are allowed; and transcription quality (based on reviews or personal experience)
      • Add other factors you deem important
  • List the name and URL of each transcription tool you compared in the "Resources gathered during the completion of the issue" section
  • If a further trial is needed, please use a personal email account, not the internship or internship-ux Gmail account, even if you have access to them
  • Summarize your findings in a comment on this issue
  • Review with UXR Leads
  • Review with Project Lead

Resources/Instructions

Resources for creating this issue

1.01 TWE: Template: Spreadsheet

Resources gathered during the completion of the issue

2.01 TWE: Research: Transcription Software Comparison
2.02 Otter.ai/Basic
2.03 Otter.ai/Pro
2.04 Otter.ai/Business
2.05 TurboScribe/Free
2.06 TurboScribe/Unlimited
2.07 Condens/Individual
2.08 Condens/Team
2.09 MeetGeek/Business
2.10 Userbit/Free
2.11 transcription software samples folder
2.12 RP012 Intern I005 Interview Recording Transcript - TurboScribe
2.13 RP012 Intern I005 Interview Recording Transcript - Otter

@pandanista
Member Author

@mh-faraji I am assigning this issue to you to research transcription software that the org could adopt. Let me know if you have any questions. Thank you.

@pandanista
Member Author

@mh-faraji: Another team member, @kevinmoutoucarpin, suggested condens.io. Let's look into that.

@kevinmoutoucarpin Thank you for suggesting Condens.

@mh-faraji
Member

@pandanista I believe Otter.ai/Business offers the best value, with unlimited imports and transcriptions of audio or video files at a reasonable price ($30 per month). Another option could be Turboscribe.ai/Unlimited ($20 per month) if we have 50 files or fewer. Other tools, like GoTranscript, Vook.ai, and Sonix, follow a per-minute or pay-as-you-go pricing model, which can become costly.
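
To make the flat-versus-per-minute trade-off in the comment above concrete, here is a minimal Python sketch of a break-even calculation. The two flat prices come from the comment; the per-minute rate is a hypothetical placeholder (none of the per-minute vendors' actual rates appear in this thread), and the function name is illustrative only.

```python
import math

def break_even_minutes(flat_monthly_usd: float, per_minute_usd: float) -> int:
    """Smallest whole number of minutes per month at which per-minute
    billing costs at least as much as the flat monthly plan."""
    if per_minute_usd <= 0:
        raise ValueError("per_minute_usd must be positive")
    return math.ceil(flat_monthly_usd / per_minute_usd)

# Flat monthly prices quoted in the comment above.
FLAT_PLANS_USD = {
    "Otter.ai/Business": 30.0,
    "TurboScribe/Unlimited": 20.0,
}

# ASSUMPTION: hypothetical per-minute rate for illustration only.
# Replace it with the real rate from the comparison spreadsheet.
EXAMPLE_PER_MINUTE_USD = 0.25

for plan, price in FLAT_PLANS_USD.items():
    minutes = break_even_minutes(price, EXAMPLE_PER_MINUTE_USD)
    print(f"{plan}: per-minute billing matches the flat price at {minutes} min/month")
```

If the team expects to transcribe more minutes per month than the break-even figure, the flat plan is the cheaper option under these assumptions.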

@pandanista
Member Author

@mh-faraji Thank you for looking into this. Let's go over those options in a bit more depth next Tuesday, before the Thursday all-team meeting.

@pandanista
Member Author

Kevin suggested two more tools:
"For the transcription tools, you can also add :

  • MeetGeek : Free plan includes 5 hours of transcription/month, 3 month transcript storage and 1 month audio storage
  • UserBit : Free for your first project so it might be possible to upload all your audio/videos and get them transcribed for free as long as you don’t create a 2nd project. At least that’s how I understood. Plus, you can add unique words or phrases to leverage improved accuracy.
    You should also look at how each of these tools process data in terms of privacy and security (it's best if you can upload audios/videos without PIIs)"

One more factor to consider, if we haven't already: can the transcripts and other files be downloaded from each tool and saved to our Google Drive? We want to make sure that we own the transcripts, not the software provider.

@pandanista pandanista moved this from In progress (actively working) to Questions/Review in P: TWE: Project Board Oct 15, 2024
@pandanista
Member Author

pandanista commented Oct 16, 2024

@sunannie27 @bonniewolfe

@pandanista pandanista added Ready for product When the issue is ready for product team to review and removed ready for research lead labels Oct 16, 2024
@ExperimentsInHonesty ExperimentsInHonesty removed the Ready for product When the issue is ready for product team to review label Oct 17, 2024
@pandanista pandanista moved this from Questions/Review to In progress (actively working) in P: TWE: Project Board Oct 17, 2024
@pandanista
Member Author

  • We will use TurboScribe's free version to transcribe interviews under 30 minutes, which is what the free version allows.
    • We will track how long each interview transcription takes, so we get a better idea of the time required.
  • When we move on to transcribing interviews over 30 minutes, we will decide between Otter and TurboScribe.
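
The plan above can be sketched as a small routing-and-tracking helper. This is only an illustration in Python: the 30-minute limit is the figure stated in the bullets, while the `Interview` class, the function names, and the choice to treat a recording of exactly 30 minutes as over the limit are assumptions, not something decided in this thread.

```python
from dataclasses import dataclass
from typing import Optional

# Limit for the free version, as stated in the bullets above.
FREE_TIER_LIMIT_MINUTES = 30

UNDECIDED = "undecided (Otter or TurboScribe)"

@dataclass
class Interview:
    label: str                                      # hypothetical identifier
    recording_minutes: float                        # length of the recording
    transcription_minutes: Optional[float] = None   # time spent, once known

def choose_tool(interview: Interview) -> str:
    """Route an interview according to the plan above.

    ASSUMPTION: "under 30 mins" is read strictly, so a recording of
    exactly 30 minutes falls into the undecided group.
    """
    if interview.recording_minutes < FREE_TIER_LIMIT_MINUTES:
        return "TurboScribe/Free"
    return UNDECIDED

def average_transcription_minutes(interviews: list) -> Optional[float]:
    """Average time spent on the interviews that have been timed so far."""
    timed = [i.transcription_minutes for i in interviews
             if i.transcription_minutes is not None]
    return sum(timed) / len(timed) if timed else None
```

Recording `transcription_minutes` for each finished interview gives the running average the team wants before choosing a tool for the longer recordings.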

@ExperimentsInHonesty
Member

@pandanista @mh-faraji I just added a new line to the TWE: Research: Transcription Software Comparison for https://MacWhisper.com, which is free. You need a Mac to run it, but it keeps the recordings off third-party sites, which is a benefit. A person could also transcribe all the interviews in one day if they wanted, since there are no limits on the amount of transcribing you can do or on the length of the videos.

@mh-faraji
Member

mh-faraji commented Oct 17, 2024

@pandanista @ExperimentsInHonesty Here are my updates after trying these three tools. MacWhisper has no restrictions, but its transcripts take a lot of time to clean up: the output is a single block of text, with sentences running together and no speaker identification or timestamps. TurboScribe/Free lets you download transcripts with a free account, but it doesn’t label speakers or provide timestamps, so the clean-up is equally time-consuming. Otter.ai offers a better solution, as its transcripts include both speaker identification and timestamps. However, its free version doesn’t allow you to download the transcript; downloading requires a subscription.

@mh-faraji
Member

@pandanista @ExperimentsInHonesty New update: I contacted Turboscribe’s customer service, and they mentioned an Advanced Export option that identifies speakers and includes timestamps. This feature appears to be available for the Free option as well. I tested it on a short YouTube video using a free account, and the transcription I downloaded included both speaker identification and timestamps. Yingjie, could you please assign an issue to me so I can start testing a few interviews and evaluate the transcription output?

@pandanista
Member Author

@mh-faraji Thank you so much for the thorough updates.

@pandanista
Member Author

pandanista commented Oct 23, 2024

As further discussed at the UXR meeting on 10/22, we decided to try the free versions of both Otter and TurboScribe on RP012's interview recordings and compare the transcript output quality, so we can make a more informed decision on the transcription software.

The transcribing template's testing issues are created:

Mehdi will set up tests for both tools, and we will review both transcripts once they are finished. Since the transcription tests will inform this software comparison, we are leaving this issue in the Review column until the tests are finished and we can decide on the software.

@ExperimentsInHonesty
Member

We decided to use TurboScribe, since we do not need the AI features of Otter AI.

@ExperimentsInHonesty ExperimentsInHonesty moved this from Questions/Review to In progress (actively working) in P: TWE: Project Board Oct 31, 2024
@ExperimentsInHonesty
Member

I moved this issue back to In progress. I think you should write a decision record for the wiki, and then we can close this issue out.

Here is the template for the Decision Record

This is a record in the [Decision Records on Solutions Adopted](https://github.com/hackforla/website/wiki/Decision-Records-on-Solutions-Adopted).

#### Issue 
#### Problem Statement
#### Potential Solution
#### Feasibility Determination

On this page in the Records section you can find examples of decision records for the website team

@pandanista
Member Author

pandanista commented Nov 14, 2024

@ExperimentsInHonesty
Since we don't have a decision record wiki page for TWE, we created a decision record page and a DR: Transcription software choice page, using the examples from the website team.

If everything looks good, we can close the issue.

@pandanista pandanista moved this from In progress (actively working) to Questions/Review in P: TWE: Project Board Nov 14, 2024
@pandanista pandanista added the Ready for product When the issue is ready for product team to review label Nov 14, 2024
@KC-skc
Member

KC-skc commented Nov 21, 2024

Decision record has been accepted by product.

@KC-skc KC-skc closed this as completed Nov 21, 2024
@github-project-automation github-project-automation bot moved this from Questions/Review to Done in P: TWE: Project Board Nov 21, 2024
@KC-skc KC-skc removed the Ready for product When the issue is ready for product team to review label Nov 21, 2024