Parallel renames #167
Comments
Need to implement proper request throttling and retry logic when doing this.
Related: This seems to be the section of code for implementing better throttling/retry logic (at least for the openai plugin):
Resumability would also be a good thing to consider.
Some of the discussion in the following issue could tangentially relate to resumability (specifically, if a consistent 'map' of renames was created, it could also show which sections of the code haven't been processed yet):
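A persistent rename map like the one described above could double as a resume checkpoint. Below is a minimal, hypothetical sketch (none of these names exist in the codebase): identifiers already present in the map are skipped on a re-run, so an interrupted job can pick up where it left off.

```typescript
// Hypothetical sketch of a resumable rename pass. The checkpoint map
// records every finished rename; names already in it are skipped.
type RenameMap = Record<string, string>;

async function resumableRename(
  names: string[],
  checkpoint: RenameMap,
  renameOne: (name: string) => Promise<string>
): Promise<RenameMap> {
  for (const name of names) {
    if (name in checkpoint) continue; // done on a previous run
    checkpoint[name] = await renameOne(name);
    // A real implementation would flush the checkpoint to disk here,
    // e.g. fs.writeFileSync("renames.json", JSON.stringify(checkpoint)),
    // so a crash loses at most one in-flight rename.
  }
  return checkpoint;
}
```

As a side effect, the finished map shows exactly which identifiers have and haven't been processed yet.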
I'm trying to process a pretty huge file and just ran into this:
I'm going to see about improving the rate limiting here:

// /src/plugins/openai/openai-rename.ts
+import Bottleneck from "bottleneck/light";
+// Math.floor(10_000 / 24) requests/hour
+const limiter = new Bottleneck({
+ "reservoir": Math.floor(10_000 / 24),
+ "reservoirRefreshAmount": Math.floor(10_000 / 24),
+ "reservoirRefreshInterval": 3_600_000
+});
export function openaiRename({
apiKey,
baseURL,
model,
contextWindowSize
}: {
apiKey: string;
baseURL: string;
model: string;
contextWindowSize: number;
}) {
const client = new OpenAI({ apiKey, baseURL });
+ const wrapped = limiter.wrap(async (code: string): Promise<string> => {
return await visitAllIdentifiers(
code,
async (name, surroundingCode) => {
verbose.log(`Renaming ${name}`);
verbose.log("Context: ", surroundingCode);
const response = await client.chat.completions.create(
toRenamePrompt(name, surroundingCode, model)
);
const result = response.choices[0].message?.content;
if (!result) {
throw new Error("Failed to rename", { cause: response });
}
const renamed = JSON.parse(result).newName;
verbose.log(`Renamed to ${renamed}`);
return renamed;
},
contextWindowSize,
showPercentage
);
+ });
+  return wrapped;
}
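Bottleneck covers the throttling half; the retry half still needs handling for transient failures. Below is a minimal, self-contained sketch of exponential backoff with jitter — `withRetry` and its parameters are hypothetical, not part of the existing code:

```typescript
// Hypothetical sketch: retry a failing async call with exponential
// backoff plus jitter. A real version would only retry on rate-limit
// errors (HTTP 429) rather than on every exception.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 5,
  baseDelayMs = 1000
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt + 1 >= maxAttempts) throw err; // give up
      const delay = baseDelayMs * 2 ** attempt + Math.random() * baseDelayMs;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

The chat-completion call inside the rename loop could then be wrapped as `withRetry(() => client.chat.completions.create(...))`.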
Context from other thread:
Could we have a PR with the majority of the fixes, even if it's not production-ready? I paused my work because I lost track of the tasks and became discouraged by the errors, compounded by a sluggish machine. I still want to deobfuscate some Chrome extensions to modify them or understand how they work better.
My thought is that it should speed up the process a lot if the renames were done in parallel. Especially if the user has enough OpenAI quota, parallelising the work could make processing large files much faster.
Local inference should also be able to run in parallel, if the user has a good enough GPU at hand.
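Either backend would need a cap on in-flight requests rather than an unbounded `Promise.all`. A minimal sketch of bounded concurrency (all names here are hypothetical):

```typescript
// Hypothetical sketch: map over items with at most `concurrency`
// promises in flight at once, preserving input order in the results.
async function mapWithConcurrency<T, R>(
  items: T[],
  concurrency: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++; // claim an item (safe: JS is single-threaded)
      results[i] = await fn(items[i]);
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(concurrency, items.length) }, worker)
  );
  return results;
}
```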
One big problem is that I've gotten the best results when applying renames from the bottom up – so say we have:
It seems that running the renames in the order a -> b -> c yields much better results than running c -> b -> a. But if we had multiple same-level identifiers like:

At least in theory it would be possible to run a first and [b, c, d] in parallel to get feasible results. In the best case there would be a second LLM pass to check that all the variables still make sense after the parallel run has finished.