describegpt: add --prompt-file option (resolves #1085) #1120
Conversation
src/cmd/describegpt.rs
Outdated
// Get max_tokens from prompt file if --prompt-file is used
let max_tokens = match args.flag_prompt_file.clone() {
    Some(prompt_file) => {
        let prompt_file = get_prompt_file(args)?;
        prompt_file.tokens
    }
    None => args.flag_max_tokens,
};
@jqnatividad Since `--max-tokens` has a default of 50, how can we check if the option itself is being used? I'm trying to make it so that if `--max-tokens` is explicitly set, it overrides the prompt file's `tokens` value, like you mentioned in the issue.
@rzmk, it's a bit hacky, but this is what I'd do:

    pub fn run(argv: &[&str]) -> CliResult<()> {
        let args: Args = util::get_args(USAGE, argv)?;
        // simulate invoking describegpt with just a stdin input, and no options set, so we get the defaults
        let argv_defaults = &["describegpt", "-"];
        let args_defaults: Args = util::get_args(USAGE, argv_defaults)?;

`args_defaults.flag_max_tokens` will then contain the default value specified in the USAGE text, which you can use later in the code to see if it was changed. This approach gives us the flexibility to insulate ourselves from future changes to the USAGE text as defaults and additional args/options are added.
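The defaults-comparison idea can be sketched end-to-end. Here `get_args` is a hypothetical stand-in for qsv's `util::get_args` (it hand-parses `--max-tokens` and hard-codes the USAGE default of 50, rather than using docopt), so only the comparison logic is the point:

```rust
// Hypothetical stand-in for util::get_args: reads --max-tokens from argv,
// falling back to the USAGE default of 50.
#[derive(Debug, PartialEq)]
struct Args {
    flag_max_tokens: u32,
}

fn get_args(argv: &[&str]) -> Args {
    let flag_max_tokens = argv
        .iter()
        .position(|a| *a == "--max-tokens")
        .and_then(|i| argv.get(i + 1))
        .and_then(|v| v.parse().ok())
        .unwrap_or(50);
    Args { flag_max_tokens }
}

fn main() {
    // A real invocation...
    let args = get_args(&["describegpt", "input.csv", "--max-tokens", "100"]);
    // ...and a simulated bare invocation that yields the defaults.
    let args_defaults = get_args(&["describegpt", "-"]);

    // If the parsed value differs from the default, the user set it explicitly.
    let explicitly_set = args.flag_max_tokens != args_defaults.flag_max_tokens;
    println!("explicitly set: {explicitly_set}");
}
```

The limitation raised later in the thread applies here too: an explicit `--max-tokens 50` is indistinguishable from the default by value comparison alone.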
@jqnatividad Getting an error this way:
$ ./target/debug/qsv describegpt NYC_311_SR_2010-2020-sample-1M.csv --tags --max-tokens 100
Invalid arguments.
Usage:
qsv describegpt [options] [<input>]
qsv describegpt --help
Also, another two ways that may work are:

- Looping through `argv` for `--max-tokens` and getting the subsequent value.
- Making `--max-tokens` optional so we can use `is_some()`.

Neither method makes `args_defaults` though, which could be useful in the future.
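The second alternative can be sketched like this; the hand-rolled parser is illustrative only, not qsv's actual docopt plumbing:

```rust
// Sketch of the "make --max-tokens optional" alternative: parse the flag
// into an Option so is_some() distinguishes "explicitly set" from "absent".
fn parse_max_tokens(argv: &[&str]) -> Option<u32> {
    argv.iter()
        .position(|a| *a == "--max-tokens")
        .and_then(|i| argv.get(i + 1))
        .and_then(|v| v.parse().ok())
}

fn main() {
    let argv = &["describegpt", "input.csv", "--max-tokens", "100"];
    match parse_max_tokens(argv) {
        // The Some branch fires whenever the user gave the flag,
        // even if the value happens to equal the default.
        Some(n) => println!("user set --max-tokens = {n}"),
        None => println!("fall back to the prompt file / default"),
    }
}
```

Unlike the value-comparison approach, this correctly handles a user passing the default value explicitly.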
My mistake @rzmk, the first element of argv is the name of the binary:

    pub fn run(argv: &[&str]) -> CliResult<()> {
        let args: Args = util::get_args(USAGE, argv)?;
        // simulate invoking describegpt with just a stdin input, and no options set, so we get the defaults
        let argv_defaults = &[argv[0], "describegpt", "-"];
        let args_defaults: Args = util::get_args(USAGE, argv_defaults)?;
Still getting the same error @jqnatividad.
Hhmmm....
It's working for me:
$ cargo t describegpt -F lite
...
$ target/debug/qsvlite describegpt C:\Users\joeln\Downloads\NYC_311_SR_2010-2020-sample-1M.csv --tags --max-tokens 100
Generating stats from \\?\C:\Users\joeln\Downloads\NYC_311_SR_2010-2020-sample-1M.csv using qsv stats --everything...
Generating frequency from \\?\C:\Users\joeln\Downloads\NYC_311_SR_2010-2020-sample-1M.csv using qsv frequency...
Interacting with OpenAI API...
Generating tags from OpenAI API...
Received tags completion.
uniquekey, createddate, closeddate, agency, agencyname, complainttype, descriptor, locationtype, incidentzip, incidentaddress, streetname, crossstreet1, crossstreet2, intersectionstreet1, intersectionstreet2, addressType, city, landmark, facilitytype, status, duedate, resolutiondescription, resolutionactionupdateddate, communityboard, bbl, borough, xcoordinatestateplane, ycoordinatestateplane, opendatachanneltype, parkfacilityname, park
Somehow it works now for me?! 🎉
@jqnatividad What if a user specifies `--max-tokens 50`? Then how can I differentiate it from the default? This is mainly what I'm considering because a user may have `7000` in their prompt file for `tokens` but then also set `--max-tokens 50`. If `args.flag_max_tokens == args_defaults.flag_max_tokens`, how can I identify whether `--max-tokens` is explicitly set, so I can prioritize `--max-tokens` over the prompt file value?

Edit: I'm trying a closure right now.
Also adds the `arg_is_some` closure to check if the user specified an arg.
@@ -415,6 +469,8 @@ fn run_inference_options(

    pub fn run(argv: &[&str]) -> CliResult<()> {
        let args: Args = util::get_args(USAGE, argv)?;
        // Closure to check if the user gives an argument
        let arg_is_some = |arg: &str| -> bool { argv.contains(&arg) };
Here's a closure that can resolve the prioritization issue.
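A minimal sketch of how that closure settles the priority question raised earlier in the thread; the token values are made up for illustration:

```rust
fn main() {
    // Simulated command line where the user explicitly passes the
    // same value as the default (50), the tricky case discussed above.
    let argv = &["describegpt", "input.csv", "--max-tokens", "50"];

    // Closure to check if the user gives an argument
    let arg_is_some = |arg: &str| -> bool { argv.contains(&arg) };

    let flag_max_tokens: u32 = 50;      // parsed CLI value (also the default)
    let prompt_file_tokens: u32 = 7000; // hypothetical tokens value from a prompt file

    // An explicit --max-tokens wins over the prompt file value,
    // even when it equals the default, because the closure checks
    // argv itself rather than comparing parsed values.
    let max_tokens = if arg_is_some("--max-tokens") {
        flag_max_tokens
    } else {
        prompt_file_tokens
    };
    println!("max_tokens = {max_tokens}");
}
```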
@@ -0,0 +1,87 @@
# `describegpt` command
@rzmk ❤️ how thorough describegpt's documentation is.
LGTM! Let's collab on it once it's merged @rzmk
🗺 Overview

- Adds a `--prompt-file` option to `describegpt` & a doc file at `docs/Describegpt.md`.
- In `src/cmd/describegpt.rs`, refactors `get_dictionary_prompt`, `get_description_prompt`, & `get_tags_prompt` into functions `get_prompt_file` & `get_prompt`.

Resolves #1085.