Revamp speech configuration documentation; add examples #123

Open. Wants to merge 8 commits into base: main
1 change: 1 addition & 0 deletions .gitignore
@@ -1,2 +1,3 @@
**/.definition
**/.preview/**
.DS_Store
42 changes: 42 additions & 0 deletions fern/customization/conversational-analysis.mdx
@@ -0,0 +1,42 @@
---
title: Conversational Analysis
subtitle: Understanding the Anatomy of Conversation as it relates to Speech Recognition
slug: customization/conversational-analysis
---

### Introduction


Conversation Analysis (CA) examines the structure and organization of human interaction, focusing on how participants manage conversations in real time. Our API mimics this natural behavior.

Key concepts include:

<AccordionGroup>

<Accordion title="Turn-Taking Organization">
Conversations are structured into turns, where typically one person speaks at a time. Speakers use Turn Construction Units (TCUs)—such as words, phrases, or clauses—that listeners recognize, allowing them to anticipate when a turn will end and when it's appropriate to speak. Transition Relevance Places (TRPs) are points where a change of speaker can occur. Turn allocation follows specific rules:

- **Current speaker selects next**: The current speaker designates who speaks next.
- **Self-selection**: If the current speaker selects no one, another participant may self-select to speak.
- **Continuation**: If no one else self-selects, the current speaker may continue.

Silences are categorized as pauses (within one speaker's turn), gaps (between one speaker's turn and the next speaker's turn), or lapses (extended silences in which no participant takes the next turn).
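
As a rough illustration of the silence categories above, the sketch below labels the silence between consecutive speech segments. It is not part of the API: the `Segment` type, the `classify_silences` function, and the 3-second `LAPSE_THRESHOLD_S` cutoff are all hypothetical, and treating a lapse as "any silence longer than a fixed threshold" is a simplifying assumption rather than a CA definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One stretch of speech (hypothetical shape, times in seconds)."""
    speaker: str
    start: float
    end: float


# Assumption: silences at least this long count as lapses.
LAPSE_THRESHOLD_S = 3.0


def classify_silences(segments, lapse_threshold=LAPSE_THRESHOLD_S):
    """Label each silence between consecutive segments as pause, gap, or lapse."""
    ordered = sorted(segments, key=lambda s: s.start)
    silences = []
    for prev, nxt in zip(ordered, ordered[1:]):
        duration = nxt.start - prev.end
        if duration <= 0:
            continue  # overlapping or back-to-back speech: no silence to label
        if duration >= lapse_threshold:
            kind = "lapse"   # long silence, nobody took the turn
        elif prev.speaker == nxt.speaker:
            kind = "pause"   # same speaker resumes: silence within a turn
        else:
            kind = "gap"     # speaker changes: silence between turns
        silences.append((kind, round(duration, 3)))
    return silences


segments = [
    Segment("caller", 0.0, 1.2),
    Segment("caller", 1.6, 2.5),
    Segment("agent", 2.9, 4.0),
    Segment("caller", 8.0, 9.0),
]
kinds = [kind for kind, _ in classify_silences(segments)]
```

Here `kinds` comes out as one pause, one gap, and one lapse, in that order. A real system would likely tune the threshold per use case rather than hard-code it.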
</Accordion>
<Accordion title="Sequence Organization">
Conversations often involve sequences like adjacency pairs, where an initial utterance (e.g., a question) prompts a related response (e.g., an answer). These pairs can be expanded with pre-sequences (preparing for the main action), insert expansions (occurring between the initial and responsive actions), and post-expansions (following the main action).
</Accordion>
<Accordion title="Preference Organization">
Certain responses are socially preferred. For example, agreements or acceptances are typically delivered promptly and directly, while disagreements or refusals may be delayed or mitigated to maintain social harmony.
</Accordion>
<Accordion title="Repair Mechanisms">
Participants address problems in speaking, hearing, or understanding through repair strategies. Self-repair (the speaker corrects themselves) is generally preferred over other-repair (another person corrects the speaker), helping to maintain conversational flow and mutual understanding.
</Accordion>
<Accordion title="Action Formation">
Speakers perform actions (e.g., questioning, requesting, asserting) through their utterances. Understanding how these actions are constructed and interpreted is central to CA, as it reveals how participants achieve social objectives through conversation.
</Accordion>
<Accordion title="Adjacency Pair">
An adjacency pair is a fundamental unit of conversation consisting of two related utterances. The first part (e.g., a question) typically elicits a specific response (e.g., an answer). These pairs are essential for structuring conversations and ensuring coherence.
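
To make the pairing idea concrete, here is a minimal sketch that matches first parts to second parts in a hand-labeled transcript. Everything in it is an assumption made for illustration: the `Utterance` type, the action labels, the `EXPECTED_SECOND` table, and the `pair_utterances` function are hypothetical and do not come from the API. The response does not have to be immediately adjacent, which allows for an insert expansion between the two parts.

```python
from typing import NamedTuple


class Utterance(NamedTuple):
    """A labeled utterance (hypothetical shape; labels assigned by hand)."""
    speaker: str
    action: str
    text: str


# Assumption: a small illustrative table of first parts and acceptable second parts.
EXPECTED_SECOND = {
    "question": {"answer"},
    "greeting": {"greeting"},
    "offer": {"acceptance", "refusal"},
}


def pair_utterances(utterances):
    """Return (first_index, second_index or None) for each first pair part."""
    pairs = []
    used = set()  # indices already consumed as second parts
    for i, first in enumerate(utterances):
        if i in used:
            continue
        expected = EXPECTED_SECOND.get(first.action)
        if expected is None:
            continue  # not a first pair part in this table
        match = None
        for j in range(i + 1, len(utterances)):
            candidate = utterances[j]
            if (j not in used
                    and candidate.speaker != first.speaker
                    and candidate.action in expected):
                match = j
                break
        if match is not None:
            used.add(match)
        pairs.append((i, match))
    return pairs


transcript = [
    Utterance("caller", "greeting", "Hi"),
    Utterance("agent", "greeting", "Hello"),
    Utterance("agent", "question", "What date works for you?"),
    Utterance("caller", "question", "This week or next?"),  # insert expansion
    Utterance("agent", "answer", "This week"),
    Utterance("caller", "answer", "Friday"),
    Utterance("agent", "offer", "Shall I book it?"),
]
pairs = pair_utterances(transcript)
```

A first part paired with `None`, such as the final offer here, is one that has not yet received its second part.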
</Accordion>
</AccordionGroup>

These foundational structures illustrate how individuals collaboratively produce and interpret talk in interaction, ensuring coherent and meaningful communication.