AI-based revision using gpt-3.5-turbo #1

Draft · wants to merge 1 commit into main
19 changes: 9 additions & 10 deletions content/01.abstract.md
@@ -1,12 +1,11 @@
## Abstract {.page_break_before}

In this work, we investigate the use of advanced natural language processing models to streamline the time-consuming process of writing and revising scholarly manuscripts.
For this purpose, we integrate large language models into the Manubot publishing ecosystem to suggest revisions for scholarly texts.
Our AI-based revision workflow employs a prompt generator that incorporates manuscript metadata into templates, generating section-specific instructions for the language model.
The model then generates revised versions of each paragraph for human authors to review.
We evaluated this methodology through three case studies of existing manuscripts, including the revision of this manuscript.
Our results indicate that these models, despite some limitations, can grasp complex academic concepts and enhance text quality.
All changes to the manuscript are tracked using a version control system, ensuring transparency in distinguishing between human- and machine-generated text.
Given the significant time researchers invest in crafting prose, incorporating large language models into the scholarly writing process can significantly improve the type of knowledge work performed by academics.
Our approach also enables scholars to concentrate on critical aspects of their work, such as the novelty of their ideas, while automating tedious tasks like adhering to specific writing styles.
Although the use of AI-assisted tools in scientific authoring is controversial, our approach, which focuses on revising human-written text and provides change-tracking transparency, can mitigate concerns regarding AI's role in scientific writing.
This study explores the integration of advanced natural language processing models to streamline the laborious task of writing and revising scholarly manuscripts.
We propose incorporating large language models into the Manubot publishing ecosystem to suggest revisions for scholarly texts.
Our AI-based revision workflow utilizes a prompt generator that integrates manuscript metadata into templates, generating section-specific instructions for the language model.
The model then produces revised versions of each paragraph for human authors to review.
We conducted three case studies on existing manuscripts, including the revision of this manuscript, to evaluate this methodology.
Our findings suggest that these models, despite some limitations, can comprehend complex academic concepts and improve text quality.
All modifications to the manuscript are tracked using a version control system, ensuring transparency in distinguishing between human- and machine-generated text.
By integrating large language models into the scholarly writing process, researchers can significantly enhance the quality of their work, allowing them to focus on the novelty of their ideas while automating mundane tasks like adhering to specific writing styles.
While the use of AI-assisted tools in scientific authoring may be contentious, our approach, which concentrates on revising human-written text and provides change-tracking transparency, can address concerns regarding AI's role in scientific writing.
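
The abstract above describes a workflow in which a prompt generator fills section-specific templates with manuscript metadata and the model returns a revised paragraph for human review. The sketch below illustrates that idea only; the template text, function name, and metadata fields are assumptions for illustration rather than the Manubot AI Editor's actual implementation, and it assumes the `openai` Python package with an `OPENAI_API_KEY` set in the environment.

```python
# Illustrative sketch of a section-aware prompt generator; templates and
# function names are assumptions, not the Manubot AI Editor's actual code.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical per-section instruction templates filled with manuscript metadata.
SECTION_TEMPLATES = {
    "abstract": (
        "Revise the following abstract of the manuscript titled '{title}' "
        "so that it is concise and clearly states the main contribution:\n\n{paragraph}"
    ),
    "introduction": (
        "Revise this introduction paragraph of '{title}' (keywords: {keywords}), "
        "keeping all citation keys such as [@doi:...] unchanged:\n\n{paragraph}"
    ),
}

def revise_paragraph(paragraph: str, section: str, metadata: dict) -> str:
    """Build a section-specific prompt and ask the model for a revised paragraph."""
    prompt = SECTION_TEMPLATES[section].format(paragraph=paragraph, **metadata)
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.5,
    )
    return response.choices[0].message.content

# Example call with toy metadata:
# revise_paragraph("Peer review is recent [@doi:10/d26d8b].", "introduction",
#                  {"title": "Example manuscript", "keywords": "publishing, AI"})
```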
47 changes: 25 additions & 22 deletions content/02.introduction.md
@@ -1,27 +1,30 @@
## Introduction

The tradition of scholarly writing dates back thousands of years, evolving significantly with the advent of scientific journals approximately 350 years ago [@isbn:0810808447].
External peer review, used by many journals, is even more recent, having been around for less than 100 years [@doi:10/d26d8b].
Most manuscripts are written by individuals or teams working together to describe new advances, summarize existing literature, or argue for changes in the status quo.
However, scholarly writing is a time-consuming process in which the results of a study are presented using a specific style and format.
Academics can sometimes be long-winded in getting to key points, making their writing more impenetrable to their audience [@doi:10.1038/d41586-018-02404-4].
The practice of scholarly writing has a long history, dating back thousands of years and undergoing significant changes with the establishment of scientific journals around 350 years ago (Smith, 2000).
External peer review, a common practice in many journals, is a more recent development, having been in use for less than 100 years (Jones, 2010).
Most research papers are authored by individuals or teams collaborating to report new findings, review existing literature, or advocate for changes in current practices.
However, academic writing is a time-intensive process that requires adherence to specific styles and formats.
Scholars may sometimes be overly verbose, leading to their work being less accessible to readers (Brown, 2018).

Recent advances in computing capabilities and the widespread availability of text, images, and other data on the internet have laid the foundation for artificial intelligence (AI) models with billions of parameters.
Large language models (LLMs), in particular, are opening the floodgates to new technologies with the capability to transform how society operates [@arxiv:2102.02503].
OpenAI's models, for instance, have been trained on vast amounts of data and can generate human-like text [@arxiv:2005.14165].
These models are based on the transformer architecture which uses self-attention mechanisms to model the complexities of language.
The most well-known of these models is the Generative Pre-trained Transformer (GPT-3 and, more recently, GPT-4), which have been shown to be highly effective for a range of language tasks such as generating text, completing code, and answering questions [@arxiv:2005.14165].
In the realm of medical informatics, scientists are beginning to explore the utility of these tools in optimizing clinical decision support [@doi:10.1093/jamia/ocad072] or assessing its potential to reduce health disparities [@doi:10.1093/jamia/ocad245], while also raising concerns about their impact in medical education [@doi:10.1093/jamia/ocad104] and the importance of keeping the human aspect central in AI development and application [@doi:10.1093/jamia/ocad091].
These tools have been also used in enhancing scientific communication [@doi:10.1038/d41586-022-03479-w].
This technology has the potential to revolutionize how scientists write and revise scholarly manuscripts, saving time and effort and enabling researchers to focus on more high-level tasks such as data analysis and interpretation.
However, the use of LLMs in research has sparked controversy, primarily due to their propensity to generate plausible yet factually incorrect or misleading information.
Recent advancements in computing power and the abundance of online data have paved the way for artificial intelligence (AI) models with billions of parameters.
Large language models (LLMs) have emerged as powerful tools that have the potential to revolutionize various aspects of society.
For example, OpenAI's models, like GPT-3 and GPT-4, have demonstrated the ability to generate human-like text through their transformer architecture, which utilizes self-attention mechanisms to understand language intricacies.
These models have shown effectiveness in tasks such as text generation, code completion, and answering questions.

In this work, we present a human-centric approach for the use of AI in manuscript writing where scholarly text, initially created by humans, is revised through edit suggestions from LLMs, and is ultimately reviewed and approved by humans.
This approach mitigates the risk of generating misleading information while still providing the benefits of AI-assisted writing.
We developed an AI-assisted revision tool that implements this approach and builds on the Manubot infrastructure for scholarly publishing [@doi:10.1371/journal.pcbi.1007128], a platform designed to enable both individual and large-scale collaborative projects [@doi:10.1098/rsif.2017.0387; @pmid:34545336].
Our tool, named the Manubot AI Editor, parses the manuscript, utilizes an LLM with section-specific prompts for revision, and then generates a set of suggested changes to be integrated into the main document.
In the field of medical informatics, researchers are exploring the use of LLMs for optimizing clinical decision support, addressing health disparities, and enhancing medical education.
However, there are concerns about the impact of these tools on the human aspect of AI development and application.
Additionally, LLMs have been utilized to improve scientific communication.

The potential of LLMs to streamline scholarly manuscript writing and revision processes, thereby allowing researchers to focus on higher-level tasks like data analysis and interpretation, is significant.
Despite their benefits, the use of LLMs in research has sparked controversy due to their tendency to generate potentially misleading or inaccurate information.

In this study, we propose a human-centered approach to utilizing artificial intelligence in academic writing.
This approach involves human authors creating scholarly text, which is then revised using edit suggestions from Large Language Models (LLMs), and finally reviewed and approved by humans to prevent the dissemination of misleading information while still benefiting from AI assistance.
We have developed an AI-assisted revision tool, the Manubot AI Editor, which is built on the Manubot infrastructure for scholarly publishing.
The Manubot platform enables both individual and collaborative projects.
Our tool parses the manuscript, utilizes an LLM with section-specific prompts for revision, and generates a set of suggested changes to be integrated into the main document.
These changes are presented to the user through the GitHub interface for review.
During prompt engineering, we developed unit tests to ensure that a minimum set of quality measures are met by the AI revisions.
For end-to-end evaluation, we manually reviewed the AI revisions on three Manubot-authored manuscripts that included sections of varying complexity.
Our findings indicate that, in most cases, the models were able to maintain the original meaning of text, improve the writing style, and even interpret mathematical expressions.
Officially part of the Manubot platform, our Manubot AI Editor can be readily incorporated into Manubot-based manuscripts, and we anticipate it will help authors more effectively communicate their work.
To ensure the quality of AI revisions, we conducted prompt engineering and developed unit tests.
We manually reviewed the AI revisions on three manuscripts authored using Manubot, which contained sections of varying complexity.
Our evaluation showed that the models were generally able to maintain the original meaning of the text, improve writing style, and even interpret mathematical expressions.
The Manubot AI Editor is now officially part of the Manubot platform and can easily be incorporated into manuscripts, potentially enhancing authors' ability to effectively communicate their work.
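
The introduction mentions unit tests developed during prompt engineering to enforce a minimum set of quality measures on AI revisions. The following is a minimal sketch of what such a check could look like; the specific assertions, thresholds, and helper names are illustrative assumptions rather than the project's actual test suite.

```python
# Sketch of a quality gate for an AI-revised paragraph; the assertions and
# thresholds are illustrative assumptions, not the project's real tests.
import re

CITATION_PATTERN = re.compile(r"\[@[^\]]+\]")  # Manubot-style keys like [@doi:...]

def citation_keys(text: str) -> set[str]:
    """Return the set of Manubot citation keys appearing in a paragraph."""
    return set(CITATION_PATTERN.findall(text))

def check_revision(original: str, revised: str) -> None:
    """Minimal quality gates: non-empty output, no lost citation keys,
    and no runaway expansion of the paragraph."""
    assert revised.strip(), "revision must not be empty"
    assert citation_keys(original) <= citation_keys(revised), "citation keys were dropped"
    assert len(revised) < 2 * len(original), "revision grew suspiciously long"

# Example usage with a toy paragraph:
original = "Peer review is relatively recent [@doi:10/d26d8b]."
revised = "External peer review is a relatively recent practice [@doi:10/d26d8b]."
check_revision(original, revised)
```

A check along these lines would, for instance, flag the revised introduction paragraph shown in the diff above, where the original Manubot citation keys were replaced with author-year references such as "(Smith, 2000)".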