* Add CLI option to download files (#34)
* Option to check if file has been uploaded in the past before uploading (#33)
The check is done based on filename, file purpose and file size
* Add fine-tuning hparams directly into the fine-tunes CLI (#35)
* update fine_tunes cli use_packing argument (#38)
* A file verification and remediation tool.
It applies the following validations:
- prints the number of examples, and warns if it's lower than 100
- ensures prompt and completion columns are present
- optionally removes any additional columns
- ensures all completions are non-empty
- infers which type of fine-tuning the data is most likely intended for (classification, conditional generation or open-ended generation)
- optionally removes duplicate rows
- infers the existence of a common suffix, and if there is none, suggests one for classification and conditional generation
- optionally prepends a space to each completion, to improve tokenization
- optionally splits into training and validation set for the classification use case
- optionally ensures there's an ending string for all completions
- optionally lowercases completions or prompts if more than 1/3 of their alphanumeric characters are upper case
It interactively asks the user to accept or reject each recommendation. Once the user is satisfied, it saves the modified output as a JSONL file, ready to be used for fine-tuning with the printed command.
* Completion: remove from kwargs before passing to EngineAPI (#37)
* Version bump before pushing to external
Co-authored-by: Todor Markov <todor.m.markov@gmail.com>
Co-authored-by: Boris Power <81998504+BorisPower@users.noreply.github.com>
Co-authored-by: Dave Cummings <dave@openai.com>
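A few of the validations listed in the commit message can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the tool's actual implementation: the names validate_examples, to_jsonl and min_examples are hypothetical, the optional steps are applied unconditionally instead of being offered interactively, and dropping rows with an empty completion is an assumed remediation.

```python
import json


def validate_examples(rows, min_examples=100):
    """Apply a subset of the described checks to a list of dict examples.

    Hypothetical helper: returns (cleaned_rows, messages).
    """
    messages = [f"{len(rows)} examples"]
    # Warn when the dataset is smaller than the suggested minimum.
    if len(rows) < min_examples:
        messages.append(f"warning: fewer than {min_examples} examples")

    # Ensure the prompt and completion columns are present in every row.
    for key in ("prompt", "completion"):
        if any(key not in row for row in rows):
            raise ValueError(f"missing required column: {key}")

    cleaned, seen = [], set()
    for row in rows:
        prompt, completion = row["prompt"], row["completion"]
        # Assumed remediation: drop rows whose completion is empty.
        if not completion:
            messages.append("removed a row with an empty completion")
            continue
        # Remove duplicate (prompt, completion) pairs.
        if (prompt, completion) in seen:
            continue
        seen.add((prompt, completion))
        # Prepend a space to the completion unless it already has one.
        if not completion.startswith(" "):
            completion = " " + completion
        # Keep only the two expected columns, dropping any extras.
        cleaned.append({"prompt": prompt, "completion": completion})
    return cleaned, messages


def to_jsonl(rows):
    """Serialize the rows as JSONL, one JSON object per line."""
    return "\n".join(json.dumps(row) for row in rows)
```

Calling validate_examples on rows loaded from a file and writing to_jsonl(cleaned) to disk would approximate the final "save as JSONL" step described above.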
+        help="JSONL, JSON, CSV, TSV, TXT or XLSX file containing prompt-completion examples to be analyzed."
+        "This should be the local file path.",
+    )
+    sub.set_defaults(func=FineTune.prepare_data)
+
+
+def api_register(parser):
     # Engine management
     subparsers = parser.add_subparsers(help="All API subcommands")
@@ -544,6 +676,12 @@ def help(args):
         "be the ID of a file uploaded through the OpenAI API (e.g. file-abcde12345) "
         "or a local file path.",
     )
+    sub.add_argument(
+        "--no_check_if_files_exist",
+        dest="check_if_files_exist",
+        action="store_false",
+        help="If this argument is set and training_file or validation_file are file paths, immediately upload them. If this argument is not set, check if they may be duplicates of already uploaded files before uploading, based on file name and file size.",
+    )
     sub.add_argument(
         "-m",
         "--model",
@@ -554,13 +692,84 @@ def help(args):
         action="store_true",
         help="If set, returns immediately after creating the job. Otherwise, waits for the job to complete.",
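The --no_check_if_files_exist flag in the diff combines action="store_false" with a dest, so the parsed attribute check_if_files_exist defaults to True and becomes False only when the flag is passed. The sketch below shows that behaviour together with a name-and-size duplicate check of the kind the help text describes. It is an illustration under assumptions: build_parser and find_possible_duplicates are hypothetical names, the parser is reduced to two arguments, and the uploaded records are assumed to be dicts with "filename" and "bytes" keys.

```python
import argparse
import os


def build_parser():
    """Build a reduced, hypothetical parser with only the relevant arguments."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--training_file", required=True)
    # store_false: the attribute defaults to True and is set to False
    # when the flag appears on the command line.
    parser.add_argument(
        "--no_check_if_files_exist",
        dest="check_if_files_exist",
        action="store_false",
    )
    return parser


def find_possible_duplicates(path, size, uploaded):
    """Return uploaded records whose file name and size match the local file.

    `uploaded` is assumed to be a list of dicts with "filename" and "bytes".
    """
    name = os.path.basename(path)
    return [f for f in uploaded if f["filename"] == name and f["bytes"] == size]


if __name__ == "__main__":
    args = build_parser().parse_args()
    if args.check_if_files_exist and os.path.isfile(args.training_file):
        # A real client would fetch this list from the API; it is empty here.
        uploaded = []
        size = os.path.getsize(args.training_file)
        matches = find_possible_duplicates(args.training_file, size, uploaded)
        print(f"{len(matches)} possible duplicate(s) found")
```

Naming the flag negatively while storing a positive attribute keeps the default behaviour (checking for duplicates) on without requiring callers to pass anything.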