Refactor CodeChange code #91

PCSwingle · 2023-09-12T18:58:41Z

Refactoring CodeChange into a concrete and abstract format; the concrete format would manage parsing the model's output and would be able to convert into an abstract change. The abstract change would use an internal format and would handle actually changing files. This would allow us to easily create new formats and test them out, without having to change anything outside of the parsing (and maybe streaming?) logic. After talking with @biobootloader here is the current plan:

Overview

Concrete Change: Contains logic for parsing different model formats; can be converted to an abstract change. Can also be converted from an abstract change so that we can automatically generate examples for different models (and possibly tests) from a single example set.

Additions/Deletions: Make up an Abstract Change. These will be the simplest possible changes, either adding a block of code at a certain line or deleting a certain block of code. Ideally any conflicts will be resolved by the concrete change parser, but even if they aren't the Additions and Deletions will be processed in order and so conflicts shouldn't be an issue.

Abstract Change: Will be stored per file; each file will contain a list of Additions and Deletions; additionally, each Addition and Deletion will be marked with a specific Concrete Change; this will allow the user to use interactive mode to accept or reject (and eventually maybe ask the model to change?) specific changes that it outputs.

Plan

Create Abstract Changes, Additions/Deletions, and modify CodeFileManager to use them when applying changes
Create tests for Abstract Changes
Create Concrete Change basic interface
Look into seeing if possible to have unified streaming logic for different Concrete Change formats; if so, create it
Migrate current format parsing and streaming over to Concrete Change
Test current format Concrete Change
Create two-way conversion between Abstract Change and Concrete Change
Add testing for conversion
Add un-parsing for current format Concrete Change -- this will be used for tests and prompt examples
Create Abstract Change examples and tests, and use automatic conversion to Concrete Change all the way to pre-parsed text for both prompt examples and tests

Possible alternative?

Another design choice we could have made is to have Abstract Changes represent an entire file, and simply be the text of the file after applying all changes rather than having Additions and Deletions; this would resolve some problems with conflicts and renaming, but wouldn't let us be able to reference specific Concrete Changes from the Abstract Changes, meaning we would mostly likely end up simply using hunks (groups of changed lines) in interactive mode. For now, @biobootloader and I have decided to go with the previous plan to allow interactive mode to be more powerful.

PCSwingle · 2023-09-12T23:38:03Z

Slight change to the plan for AbstractChange after thinking and working on it for a bit: Planning on having Additions, Deletions, and Rename subchanges -- the Rename change will have a new name where None will represent a deletion, and a Rename when the file does not exist will represent a creation

jakethekoenig · 2023-09-18T18:29:47Z

I really like the idea of decoupling the LLM specified file change format and our internal file change object. I wonder if instead of calling them ConcreteChange and AbstractChange we might want to call them something like LlmResponseFormat (or LlmChangeParser?) and FileChange. As I understand it the responsibilities/differences of the two objects are:

ConcreteChange or LlmResponseFormat

Provide a prompt which describes the format to specify a change in. Like is currently done here.
Provide a function which parses responses which have been received from an Llm prompted with the LlmResponseFormat's prompt block and returns a collection of FileChange objects.
We may make many of these and experiment with different prompts for different Llms and evaluate things like how consistently Llms can respond in the correct format, how many tokens they use both in prompt and response and if they affect response quality.
(Maybe you imagine the ConcreteChange isn't the LlmResponseFormat which is a third thing but is the actual response describing a file change received from the Llm?)

AbstractChange or FileChange

Represents a change to a file precisely with all the information you'd need to make the change e.g. path, line #, text to insert.
Has a function to pretty print it for a user to accept/reject.
Has a function which actually makes the edit.
Once we settle on a design for this object it's unlikely to change much compared to LlmResponseFormat where there's a lot of experimentation to do.

Let me know it my summary and understanding aren't quite right but I think the dichotomy is more like Raw/Parsed than Concrete/Abstract.

PCSwingle · 2023-09-18T20:50:23Z

@jakethekoenig completely agree! Last week while I was making these changes I realized that concrete change wasn't really needed, and mapped out a completely different architecture that I really like with @biobootloader. I'll keep this issue as-is, but in #94 I lay out the new structure we agreed upon; it pretty much turns the AbstractChange into FileEdit (although FileChange would be a good name too) and removes the ConcreteChange completely.

waydegg · 2023-10-04T22:33:41Z

Can we close this issue?

PCSwingle self-assigned this Sep 12, 2023

PCSwingle mentioned this issue Sep 13, 2023

Refactor changes #94

Merged

5 tasks

biobootloader closed this as completed Oct 5, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor CodeChange code #91

Refactor CodeChange code #91

PCSwingle commented Sep 12, 2023

PCSwingle commented Sep 12, 2023 •

edited

Loading

jakethekoenig commented Sep 18, 2023

PCSwingle commented Sep 18, 2023

waydegg commented Oct 4, 2023

Refactor CodeChange code #91

Refactor CodeChange code #91

Comments

PCSwingle commented Sep 12, 2023

Overview

Plan

Possible alternative?

PCSwingle commented Sep 12, 2023 • edited Loading

jakethekoenig commented Sep 18, 2023

PCSwingle commented Sep 18, 2023

waydegg commented Oct 4, 2023

PCSwingle commented Sep 12, 2023 •

edited

Loading