Overview of Features
Hi, welcome! If you're reading this page, you're probably wondering exactly how ALICE functions from a user's perspective. Here's a guide to its features, so you know how things should work.
(If you're a dev, you can use this document to tell if something's gone wrong and figure out what to fix.)
In summary:
- ALICE is meant for "editors" who want to review transcriptions submitted by Zooniverse volunteers.
- So this app's use comes after a typical Zooniverse "transcription project" has finished, or at least has begun to retire subjects.
- ALICE's major features can be divided into its respective pages.
- ALICE requires a few external dependencies to work properly:
- Panoptes for authentication and for Subject data
- TOVE, which provides an index of Projects/Workflows owned by the user, keeps track of the Groups (analogous to Subject Sets) belonging to a Workflow, and keeps track of the edited/reviewed state of each Subject. (i.e. the Transcription resource.)
- Caesar for aggregation data for each Subject (i.e. the submitted transcriptions from volunteers, which the editor then reviews)
- ALICE requires you to have a very specific user and project setup to make any sense.
Note: this document is written for both technical and non-technical readers. If you're a web developer, look for the 🕸️ symbol for technical details. If you're not a dev, feel free to ignore anything marked with the symbol.
The Zooniverse project must be set up as follows:
- The project must have a transcription-type workflow that includes line-based positional data (Transcription Task or Line Marking + Text Entry sub-task).
- The project must be registered on TOVE. (❗ I have no idea how to do this, and this should be documented. @shaunanoordin 20220822) [nb I believe this is automated but let's confirm w/ zach @snblickhan 20220829]
- The project's transcription workflow(s) must each have an "aggregation data" setup on Caesar. (❗ I have a vague idea how to do this, and this should be documented too. @shaunanoordin 20220822) [nb if helpful here you can point to https://panoptes-python-client.readthedocs.io/en/latest/panoptes_client.html#panoptes_client.workflow.Workflow.configure_for_alice @snblickhan 20220829]
Note: you may want to see the ALICE About/Help page and the README for additional info.
The Zooniverse user must be set up as follows:
- The user must be a member of the project.
- Different roles have different access levels. (More info here.)
- 🕸️ If you're a developer, try to be an owner of or collaborator on the project to access full functionality.
For example, on staging, we have the following test user + test project set up:
- User zootester1 is a collaborator on "Anti-Slavery Testing" (originally a test project for Anti-Slavery Manuscripts)
- The "Anti-Slavery Testing" project (ID 1764, slug wgranger-test/anti-slavery-testing) is registered on TOVE staging and has several Workflows, Groups (sort of but not quite Subject Sets), and Subjects already indexed.
As a developer, you can either use the zootester1 user account to test the following features, or add yourself as a collaborator to wgranger-test/anti-slavery-testing. (last checked 2022.08.22)
As mentioned, ALICE's functionality and features are organised by pages.
- Path: (root)
- Context: As a user, I should be able to understand what website/app I'm looking at, and assuming I know the function of the website/app, I should be able to login.
The first page the user sees. Has some info about the project, and a login form.
- Accessible to any user. (Note: if a non-logged in user tries to access a protected page, they get redirected here.) [what does a 'protected page' mean in this context? @snblickhan 20220829]
⚠️ UI/UX QUIRK: If you're already logged in, the home page STILL has a login form. There's no indication that you're logged in, because the app assumes you'll never navigate to the home page again after you've already logged in.
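🕸️ A minimal sketch of the redirect rule described above, assuming that the home page and the About page are the only public pages and that everything else is protected. The function name and the set of public paths are illustrative and are not taken from ALICE's actual router code.

```python
# Assumption: only the home page and the About page are public.
PUBLIC_PATHS = {"/", "/about"}


def resolve_path(requested_path: str, logged_in: bool) -> str:
    """Return the path the user ends up on after the login check."""
    if requested_path in PUBLIC_PATHS or logged_in:
        return requested_path
    # A non-logged-in user asking for a protected page is sent to the home page.
    return "/"
```

For example, `resolve_path("/projects", logged_in=False)` sends the user back to `/`, while a logged-in user stays on `/projects`.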
- Path: /about
- Context: As a user, I should be able to learn more about how this website/app works. [this is a very good point -- I will make a note to myself to write an opening paragraph introducing readers to the "what" and "why" of ALICE @snblickhan 20220829]
Standard content page. Less a "What's this website about?" page (despite the path name) and more a "How do I use this website?" page.
- Accessible to any user.
- Accessible via top-right menu, under "HELP"
- 🕸️ No notable functionality (i.e. nothing can break), but content should be checked to ensure it makes sense to the user.
- Path: /projects
- Context: As an editor (user), I should be able to choose which of my Projects to work on.
Index/listing page. Lists all Projects belonging to the user.
- 🔒 Protected page (logged-in users only)
- Accessible either via...
- Logging in from the home page (this is the first page you're redirected to upon a successful login)
- Top-right menu, under "Viewer"
- Breadcrumbs near the top of this page's "child pages" (Workflows Page, etc)
- Has two separate lists:
- A list of "Your Projects" (where you're the owner/collaborator)
- A list of "Collaborations" (where you're the Expert/Researcher/Moderator/Tester)
- Does NOT have pagination.
🕸️ Data sources:
- Pulls a list of Projects (that the user is a member of) from TOVE.
- Pulls corresponding "Project" resources from Panoptes.
- Pulls corresponding "Project Roles" from Panoptes.
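🕸️ The split into the two lists could be sketched as follows, combining the Projects with the Project Roles. The role strings here mirror the labels used on this page; the actual role identifiers returned by Panoptes may be spelled differently, and the data shapes are hypothetical.

```python
# Assumption: role names follow the labels on this page, lowercased.
YOUR_PROJECT_ROLES = {"owner", "collaborator"}
COLLABORATION_ROLES = {"expert", "researcher", "moderator", "tester"}


def split_projects(projects, roles_by_project_id):
    """Sort projects into the "Your Projects" and "Collaborations" lists."""
    your_projects, collaborations = [], []
    for project in projects:
        roles = set(roles_by_project_id.get(project["id"], []))
        if roles & YOUR_PROJECT_ROLES:
            your_projects.append(project)
        elif roles & COLLABORATION_ROLES:
            collaborations.append(project)
    return your_projects, collaborations
```

In this sketch a project where the user holds roles from both sets is listed only under "Your Projects"; whether ALICE behaves the same way has not been checked.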
🕸️ Possible states:
- Loading state, with spinner.
- Error state, with error message.
- Success state, with projects listed.
- "No data" success state, with the message: "We couldn't find any transcription projects you participate in." [is there a way to generate an example of this, or share a screenshot? I honestly did not know it existed and thought the sections were just empty. @snblickhan 20220829]
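🕸️ The four states above can be modelled as a single decision. This is a sketch with made-up argument names, not the way ALICE's stores are actually structured; it only shows how the "No data" state differs from a regular success.

```python
def listing_state(is_loading, error, items):
    """Map the result of a fetch onto one of the four listing-page states."""
    if is_loading:
        return "loading"   # show the spinner
    if error is not None:
        return "error"     # show the error message
    if not items:
        return "no-data"   # fetch succeeded, but nothing to list
    return "success"       # fetch succeeded, list the items
```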
- Path: /projects/{PROJECT_ID}/workflows (🕸️ example)
- Context: As an editor (user), I should be able to choose which Workflow to work on.
Index/listing page. Lists all Workflows belonging to the Project.
Note: Workflows, Groups, and Subjects Pages share a lot of functionality.
- 🔒 Protected page
- Accessible via selecting a Project on the Projects Page. (Or via breadcrumb on child pages.)
- No searching or sorting.
- Has pagination.
🕸️ Data sources:
- Pulls Workflows from TOVE.
- These TOVE Workflows differ from Panoptes Workflows by having additional data, e.g. "transcription_count" and, more importantly...
- ...a list of Groups.
- Side note: also pulls Project and Project Roles from Panoptes, if necessary. (e.g. if user opens page directly via URL instead of navigating from Projects Page.)
🕸️ Possible states:
- Loading state, with spinner.
- Error state, with error message.
- Success state, with workflows listed.
- "No data" success state, with the message: "Sorry, we couldn't find any Workflows."
- Path: /projects/{PROJECT_ID}/workflows/{WORKFLOW_ID}/groups (🕸️ example)
- Context: As an editor (user), I should be able to choose which Group to work on.
Index/listing page. Lists all Groups belonging to the Workflow.
- 🔒 Protected page
- Accessible via selecting a Workflow on Workflows Page. (Or via breadcrumb on child pages.)
- No searching or sorting.
- Has pagination.
🕸️ Data sources:
- Same as Workflows Page! (There's no separate endpoint for pulling a list of Groups.) [Useful to add info here about how Groups are created via the group_id field in the subject manifest? @snblickhan 20220829]
- (Note: the Request path+query to TOVE changes slightly depending on whether you access the Groups Page via the Workflows Page or via direct URL, but the results are the same.)
🕸️ Possible states:
- Same as Workflows Page, except replace "workflows" with "groups." [will this actually happen? if no group_id field is present in the subject manifest, all data will go to a group automatically named default -- what's the use case for a 'no groups found here' error? I'm guessing a technical issue maybe? @snblickhan 20220829]
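🕸️ The editorial note above says that Subjects without a group_id in the subject manifest end up in a group named default. A minimal sketch of that rule, assuming manifest rows are plain dictionaries (the `subject_id` key and the row layout are hypothetical, and the real grouping happens outside ALICE):

```python
from collections import defaultdict


def group_subjects(manifest_rows):
    """Group subject IDs by their group_id, falling back to "default"."""
    groups = defaultdict(list)
    for row in manifest_rows:
        # A missing or empty group_id falls back to the "default" group.
        group_name = row.get("group_id") or "default"
        groups[group_name].append(row["subject_id"])
    return dict(groups)
```

Under this rule any Workflow with at least one Subject has at least one Group, which is why an empty Groups list would most likely point to a technical issue.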
- Path: /projects/{PROJECT_ID}/workflows/{WORKFLOW_ID}/groups/{GROUP_ID}/subjects (🕸️ example)
- Context: As an editor (user), I should be able to find and choose which Subject/Transcription to work on, and view the status of each Subject/Transcription.
Index/listing page. Lists all Subjects belonging to the Group. Or, to be accurate, this lists all Transcriptions belonging to the Group.
(🕸️ Note: from a user's perspective, a Transcription resource is the same as a Subject resource, since they have a 1-to-1 pairing. As a dev however, it's important to separate the Transcriptions on TOVE from the Subjects on Panoptes, as they're two separate data resources.)
- 🔒 Protected page
- Accessible via selecting a Group on Groups Page. (Or via breadcrumb on child pages.)
- Has pagination.
- Has sorting:
- Each column can be sorted ascending/descending.
- Sorting is done server-side.
- Has searching/filtering:
- "Search" button at the top of the page opens a modal, which allows users to specify search/filter parameters for the Subjects page.
- Searching/filtering is done server-side.
Additional functionality: Download Approved Group Data
- Clicking on "Download Approved Group Data" on the top opens a modal, which allows users to request all approved Transcriptions done in this Group.
- This sends a request to TOVE to provide a ZIP file in response.
Data sources:
- Pulls Transcriptions from TOVE.
- Side note: also pulls Project and Project Roles from Panoptes; and Workflows from TOVE; if necessary.
- For the "Download Approved Group Data" functionality, there's a separate endpoint on TOVE.
Possible states:
- Same as Workflows Page, except replace "workflows" with "subjects/transcriptions."
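🕸️ Because pagination, sorting, and searching/filtering are all done server-side, each change on the Subjects Page turns into a new request to TOVE. The sketch below shows the general idea only: the endpoint path and every parameter name are hypothetical placeholders, not TOVE's real query syntax.

```python
from urllib.parse import urlencode


def build_transcriptions_query(group_id, page=1, sort_column=None,
                               descending=False, filters=None):
    """Build a request path for one page of Transcriptions in a Group."""
    params = {"group_id": group_id, "page": page}
    if sort_column:
        # Hypothetical convention: a leading "-" means descending order.
        params["sort"] = ("-" if descending else "") + sort_column
    params.update(filters or {})
    return "/transcriptions?" + urlencode(params)
```

Since the server does the sorting and filtering, the client only needs to re-request the current page whenever the user changes a column sort or submits the search modal.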
- Path: /projects/{PROJECT_ID}/workflows/{WORKFLOW_ID}/groups/{GROUP_ID}/subjects/{SUBJECT_ID}/edit (🕸️ example)
- Context: As an editor (user), I should be able to (1) view a Subject, (2) view the Transcriptions attached to it, and (3) edit the Transcriptions.
Feature-heavy interactive page. Allows the editor to view the Subject and edit the Transcription data.
- 🔒 Protected page
- Accessible via selecting a Subject on Subjects Page.
- 🖊️ This is the only page that can write to TOVE.
- Divided into multiple sub-sections. (see below)
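🕸️ The page paths listed so far nest inside each other, which is also what the breadcrumbs rely on. As a sketch, the path for any page can be built from however many IDs are known. The helper name is made up, and apart from project ID 1764 the IDs in the usage example are invented.

```python
def page_path(project_id=None, workflow_id=None, group_id=None, subject_id=None):
    """Build the path of the deepest page reachable with the given IDs."""
    path = "/projects"
    if project_id is None:
        return path                      # Projects Page
    path += f"/{project_id}/workflows"
    if workflow_id is None:
        return path                      # Workflows Page
    path += f"/{workflow_id}/groups"
    if group_id is None:
        return path                      # Groups Page
    path += f"/{group_id}/subjects"
    if subject_id is None:
        return path                      # Subjects Page
    return path + f"/{subject_id}/edit"  # Editor page
```

For example, `page_path(1764)` gives `/projects/1764/workflows`, and `page_path(1764, 10, "default", 99)` gives `/projects/1764/workflows/10/groups/default/subjects/99/edit`.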
🕸️ Data sources:
- Pulls Subject from Panoptes.
- Pulls Transcriptions from TOVE.
- 🖊️ Updates Transcriptions to TOVE. (PATCH requests, mostly)
- Pulls Aggregations from Caesar.
- Side note: also pulls Project and Project Roles from Panoptes; and Workflows from TOVE; if necessary.
🕸️ Advanced Notes:
- If a Subject does not have a corresponding Transcription resource on TOVE,
- an empty/default Transcription is created on ALICE.
- Aggregated data from Caesar is pulled to populate the empty/default Transcription on ALICE, creating the lines of text the user sees.
- When a change is made, the Transcription is saved to (i.e. created on) TOVE. (⚠️ Requires confirmation - somebody please test with a fresh Subject.)
- If a Subject does have a corresponding Transcription resource on TOVE,
- the Transcription data is pulled to ALICE.
- ❓ Aggregated data from Caesar is essentially ignored? Because the editor-modified lines of text take precedence? (⚠️ Requires confirmation.)
- When a change is made, the existing Transcription on TOVE is updated.
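🕸️ The two cases above can be summarised in a short sketch. It encodes the behaviour as currently understood, including the points still marked as requiring confirmation, so treat it as a statement of the hypothesis rather than verified behaviour. The function name and data shapes are hypothetical.

```python
def open_transcription(tove_transcription, caesar_lines):
    """Decide which lines the editor sees and what the first save does."""
    if tove_transcription is None:
        # No Transcription on TOVE: start from the Caesar aggregation,
        # and create the Transcription on TOVE at the first change.
        return {"lines": list(caesar_lines), "save_action": "create"}
    # A Transcription exists: its (possibly editor-modified) lines take
    # precedence, and the Caesar aggregation is ignored.
    return {"lines": list(tove_transcription["lines"]), "save_action": "update"}
```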
Section: Editor Header
Found at the top of the page, adds additional options for the editor (user).
- Shows the "autosave" status of the Transcription. Usually says "All changes saved."
- User can mark the Transcription as approved or otherwise.
- Context: an "approved" Transcription indicates that an editor (user) thinks that no further changes are necessary on any of this document's pages.
- A Transcription that's been approved can no longer have its contents edited.
- You can un-approve a Transcription and then continue editing, though. This will prompt a confirmation modal.
- User can undo the most recent content edit. (e.g. rearranging lines of text)
- User can change the layout of the Subject Viewer and Aggregated Transcriptions Panel, switching from horizontal (side-by-side) to vertical.
- Context: this is useful when a Subject image is wide instead of tall, e.g. a postcard.
- User can download Subject data, if the Transcription is approved.
- The ZIP should contain (1) consensus data, which is all the text from the Transcription, flattened into a TXT file
- Context: the consensus data is useful for libraries/catalogues/etc who aren't interested in how the lines of text are spatially positioned on a page; their database only wants the text on the page to be transcribed and readable.
- The ZIP should also contain (2) metadata on the Transcription itself (e.g. who approved it), (3) full data on the transcribed lines (i.e. text, spatial position, and confidence level), and (4) the "raw data" JSON of the Transcription.
- User can edit Aggregation Settings for individual Subjects, if the Transcription isn't approved.
- Changing the aggregation settings changes how the aggregated transcribed lines are "calculated" from the groups of individual volunteer-submitted lines. This is an advanced feature.
- This will also reset any editor-made changes (e.g. addition of additional lines, rearrangement of lines, etc).
🕸️ Possible states:
⚠️ None. The Transcription fetch isn't explicitly monitored for loading/success/error states; the UI just changes (e.g. Approved button suddenly flipping on) when data is successfully fetched.
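🕸️ Several Editor Header options depend on whether the Transcription is approved. A compact sketch of those rules, using made-up action names; options that this page does not tie to the approval status (undo, layout switching) are left out.

```python
def allowed_actions(approved: bool) -> set:
    """Return the approval-dependent actions available to the editor."""
    if approved:
        # Approved: contents are frozen, but the data can be downloaded.
        return {"unapprove", "download_subject_data"}
    # Not approved: contents and aggregation settings can be changed.
    return {"approve", "edit_lines", "edit_aggregation_settings"}
```

A dev checking the page can compare the buttons that are enabled against this set for both an approved and an unapproved Transcription.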
Section: Subject Viewer
Shows the Subject image. On the default layout, appears on the left of the page.
- User can zoom (via UI buttons), pan (via click-and-drag), and rotate (via UI button) the image.
- If the Subject has transcribed lines of text, then those Transcription Lines (TL) will appear visually over the image as multi-coloured lines.
- User can show/hide these visual lines.
- User can click on a line to select it. (Same as selecting the corresponding TL on the Aggregated Transcriptions Panel)
- When one TL is selected, all other visual lines are hidden.
- For Subjects with multiple images, this initially shows the first image.
- The Filmstrip Viewer (below) can be used to select other images in the Subject.
- Note: assumption is that every Transcription project uses multi-image Subjects. No video or audio file support is available.
🕸️ Possible states:
- Loading state, with "Loading Subject..." message.
- Error state, with error message.
- Success state, with Subject image visible.
- Quirk: it's possible for the error message to appear ON TOP of a fully loaded Subject image.
- Quirk: when a Subject loads, but the image file(s) is unavailable, there's no explicit error message.
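🕸️ The show/hide and selection rules for the visual lines can be sketched like this. The function and argument names are illustrative, and the sketch assumes that hiding the lines wins over a selection, which this page does not specify.

```python
def visible_line_ids(line_ids, show_lines=True, selected_id=None):
    """Return the IDs of the Transcription Lines drawn over the image."""
    if not show_lines:
        return []  # user has hidden the visual lines (assumed to win)
    if selected_id is not None:
        # When one TL is selected, all other visual lines are hidden.
        return [line_id for line_id in line_ids if line_id == selected_id]
    return list(line_ids)
```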
Section: Aggregated Transcriptions Panel
Shows aggregated lines of text from the Transcription for the Subject. On the default layout, appears on the right of the page.
- User (editor) can select a line by clicking on it. This opens the "Line Viewer" modal.
- User can change what the line of text says, either by (1) selecting text submitted by the volunteers or (2) submitting their own text. By default, the "aggregated text" generated by processing all the volunteer-submitted text is chosen. Note: after making a choice, the "replace with selected" button must be clicked to confirm the change.
- User can add another line after this (see "add a line" below), or delete the current one.
- User can mark a line of text as "already seen". Context: this indicates to other editors that this line probably doesn't need further review or reminds a reviewer where they've left off, if reviewing over multiple sessions.
- User can flag a line of text. Context: this indicates to other editors lines that aren't particularly clear or are otherwise problematic, and would benefit from additional review.
- When an aggregated line of text is selected, the Subject Viewer will only show the corresponding visible line. All other visible lines on the Subject Viewer are hidden.
- User (editor) can rearrange lines by dragging them.
- User (editor) can add a line if necessary.
- These editor-added lines are different from volunteer-aggregated lines in many ways.
- For example, volunteer-aggregated lines have a corresponding visual line on the Subject Viewer, while the editor-added lines don't.
- Context: this allows users (editors) to transcribe text that volunteers may have missed. This is useful for generating consensus data which isn't interested in the spatial positioning of text on a page.
Screenshot: Editor page, with a line of text selected. (Line Viewer) Note: the green dot (already seen) symbol is smooshed too close to the red flag symbol; this is a UI error.
🕸️ Possible states:
⚠️ Quirk/Questionable. Either (1) you see lines of text, or (2) you see the message "This page does not contain transcription data." [User case: sometimes projects have pages w/o text, OR there's user error and a page gets marked as complete without any positional data/transcription data provided @snblickhan 20220829]
- The Transcription fetch isn't explicitly monitored for loading/success/error states; the UI just changes (e.g. lines of text suddenly appearing) when data is successfully fetched.
- As a result, Option 2 might indicate either (a) the Subject has no transcription data, (b) there was a problem fetching the data, or (c) the website is still trying to fetch the data.
- Note: if the Transcription resource is unavailable (e.g. artificially blocked via dev network editing), the whole page might lock with the "Subject locked: This subject cannot be accessed because is currently accessing it." alert/modal.
Section: Filmstrip Viewer (aka Pagination for Pages)
For Subjects with multiple images (e.g. letters with multiple pages), this section visually lists all of them as thumbnails. Appears at the bottom of the page.
- User can select the active page by clicking on the thumbnail. (Panoptes terminology: user can switch the active Subject frame)
- User can rearrange pages by dragging and dropping.
- User can delete pages.
Please see Anatomy of a Transcription, Dividing a Transcription into pages and slopes for additional info on how pages are grouped, and why pages with multidirectional text will have multiple thumbnail images in the filmstrip viewer.
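🕸️ Rearranging pages by drag-and-drop boils down to moving one item within the ordered list of pages. A minimal sketch, assuming pages are kept in a plain list; the function name and page labels are made up.

```python
def move_page(pages, from_index, to_index):
    """Return a new page order with one page moved to a new position."""
    reordered = list(pages)           # leave the original order untouched
    page = reordered.pop(from_index)  # lift the dragged page out
    reordered.insert(to_index, page)  # drop it at its new position
    return reordered
```

For example, dragging the first of three pages to the end, `move_page(["p1", "p2", "p3"], 0, 2)`, gives `["p2", "p3", "p1"]`.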
Additional Notes
- Panel resizing: the size of the Subject Viewer vs the Aggregated Transcriptions Panel can be changed by dragging the "handle" on the dividing bar.
- Autosave: changes to the Transcription are saved (i.e. written to TOVE) automatically on every data-changing action.
- Locking: when an editor views or edits a Transcription, that Transcription is locked to other editors. This prevents multiple editors from simultaneously viewing/editing the same Subject. Locks expire after a period of idle time.
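🕸️ The locking rule can be modelled as a lock that belongs to one editor and lapses after a period without activity. This is only a sketch of the behaviour described above: the class, its method, and the timeout value are hypothetical, and the real lock is presumably tracked on the server rather than in ALICE.

```python
class TranscriptionLock:
    """A lock held by one editor that expires after a period of idle time."""

    def __init__(self, idle_timeout_seconds):
        self.idle_timeout = idle_timeout_seconds
        self.holder = None
        self.last_activity = None

    def acquire(self, editor, now):
        """Try to take (or refresh) the lock at time `now`, in seconds."""
        expired = (
            self.last_activity is not None
            and now - self.last_activity >= self.idle_timeout
        )
        if self.holder in (None, editor) or expired:
            self.holder = editor
            self.last_activity = now  # activity resets the idle timer
            return True
        return False                  # someone else is still active
```

With a 600-second timeout, a second editor is refused while the first one keeps working, and gets the lock once the first editor has been idle for 600 seconds.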