gen #775
base: main
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
not stale
Copilot AI left a comment
Copilot reviewed 6 out of 12 changed files in this pull request and generated 5 suggestions.
Files not reviewed (6)
- src/autotrain/app/templates/index.html: Language not supported
- src/autotrain/cli/autotrain.py: Evaluated as low risk
- src/autotrain/datagen/clients.py: Evaluated as low risk
- src/autotrain/app/static/scripts/listeners.js: Evaluated as low risk
- src/autotrain/trainers/text_classification/params.py: Evaluated as low risk
- src/autotrain/trainers/text_classification/main.py: Evaluated as low risk
return
cmd = f"autotrain --config {params.training_config}"
logger.info(f"Running AutoTrain: {cmd}")
cmd = [str(c) for c in cmd]
The command should be split into a list of arguments, not characters. Because cmd is a string here, iterating over it yields one element per character. Use cmd = cmd.split() instead.
Suggested change:
- cmd = [str(c) for c in cmd]
+ cmd = cmd.split()
Copilot is powered by AI, so mistakes are possible. Review output carefully before use.
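To illustrate the difference, here is a minimal sketch, not the PR's code. The helper name build_command and the config path are hypothetical. It uses shlex.split from the standard library rather than str.split, on the assumption that the config path might one day contain quoted spaces.

```python
import shlex


def build_command(config_path):
    """Return the autotrain invocation as a list of arguments."""
    # Hypothetical helper: mirrors the f-string from the diff above.
    cmd = f"autotrain --config {config_path}"
    # shlex.split tokenizes on whitespace and respects shell-style quoting.
    return shlex.split(cmd)


cmd_string = "autotrain --config configs/train.yml"

# Iterating over a string yields single characters, which is the bug flagged above.
per_char = [str(c) for c in cmd_string]
print(per_char[:4])

# Splitting yields one element per argument.
print(build_command("configs/train.yml"))
```

If the arguments are already known separately, building the list directly (for example ["autotrain", "--config", path]) avoids splitting altogether.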
path = os.path.join(output_dir, "gen_params.json")
# save formatted json
with open(path, "w", encoding="utf-8") as f:
    f.write(self.model_dump_json(indent=4))
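The method 'model_dump_json' exists on BaseModel only in Pydantic v2. If the project runs on Pydantic v1, it should be 'self.json(indent=4)'; under v2 the existing call is correct and 'json' is deprecated.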
Suggested change:
- f.write(self.model_dump_json(indent=4))
+ f.write(self.json(indent=4))
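Which method is available depends on the installed Pydantic major version. A minimal sketch of a version-tolerant call follows, assuming the code might run under either version. The helper dump_model_json and the two stand-in classes are hypothetical and exist only so the example runs without Pydantic installed.

```python
import json


def dump_model_json(model, indent=4):
    """Serialize a Pydantic-style model with whichever method it offers."""
    if hasattr(model, "model_dump_json"):
        # Pydantic v2 naming.
        return model.model_dump_json(indent=indent)
    # Pydantic v1 naming.
    return model.json(indent=indent)


class V1Like:
    """Hypothetical stand-in exposing only the v1-style method."""

    def json(self, indent=None):
        return json.dumps({"api": "v1"}, indent=indent)


class V2Like:
    """Hypothetical stand-in exposing only the v2-style method."""

    def model_dump_json(self, indent=None):
        return json.dumps({"api": "v2"}, indent=indent)


print(dump_model_json(V1Like()))
print(dump_model_json(V2Like()))
```

If the project pins a single Pydantic major version, calling that version's method directly is simpler than this kind of dispatch.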
src/autotrain/datagen/text.py
Outdated
)

if message is None:
    logger.warning("Failed to generate data. Retrying...")
The code retries indefinitely if the message is None, which could lead to an infinite loop. Consider adding a maximum retry limit.
Suggested change:
- logger.warning("Failed to generate data. Retrying...")
+ if message is None and counter < self.params.max_retries:
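One way to bound the retries is sketched below. It is not the PR's implementation: generate_with_retries and its generate callable are hypothetical names, and max_retries stands in for the self.params.max_retries field that the suggestion assumes exists.

```python
import logging

logger = logging.getLogger(__name__)


def generate_with_retries(generate, max_retries=3):
    """Call generate() until it returns a non-None value or attempts run out."""
    for attempt in range(1, max_retries + 1):
        message = generate()
        if message is not None:
            return message
        logger.warning("Failed to generate data (attempt %d/%d).", attempt, max_retries)
    # Fail loudly instead of looping forever.
    raise RuntimeError(f"No data generated after {max_retries} attempts")


# Usage: a fake generator that fails twice, then succeeds.
responses = iter([None, None, "sample"])
print(generate_with_retries(lambda: next(responses), max_retries=3))
```

Raising after the final attempt lets the caller decide whether to abort or skip the batch; returning None instead would push the same check back onto every caller.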
src/autotrain/datagen/text.py
Outdated
TEXT_CLASSIFICATION_DATA_PROMPT = """
The dataset for text classification is in JSON format.
Each line should be a JSON object with the following keys: text and target.
Make sure each text sample has atleast {min_words} words.
The word 'atleast' should be 'at least'.
Suggested change:
- Make sure each text sample has atleast {min_words} words.
+ Make sure each text sample has at least {min_words} words.
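For context, the {min_words} placeholder is presumably filled in with str.format before the prompt is sent. The sketch below shows that substitution, plus a hypothetical check that one generated line matches what the prompt asks for. The PR is not shown doing such a check; is_valid_sample is an assumed name used only for illustration.

```python
import json

# The corrected prompt line from the suggestion above.
PROMPT_LINE = "Make sure each text sample has at least {min_words} words."


def is_valid_sample(line, min_words):
    """Hypothetical check: a JSON object with text and target keys and enough words."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return False
    if not isinstance(record, dict) or not {"text", "target"} <= set(record):
        return False
    text = record["text"]
    return isinstance(text, str) and len(text.split()) >= min_words


print(PROMPT_LINE.format(min_words=10))

sample = json.dumps({"text": "the battery lasts all day and charges fast", "target": "positive"})
print(is_valid_sample(sample, min_words=5))
```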
src/autotrain/datagen/text.py
Outdated
SEQ2SEQ_DATA_PROMPT = """
The dataset for sequence-to-sequence is in JSON format.
Each line should be a JSON object with the following keys: text and target.
Make sure each text sample has atleast {min_words} words.
The word 'atleast' should be 'at least'.
Suggested change:
- Make sure each text sample has atleast {min_words} words.
+ Make sure each text sample has at least {min_words} words.
No description provided.