-
Notifications
You must be signed in to change notification settings - Fork 11
[example] add qa recipe #37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
west/dataset/dataset.py
Outdated
| try: | ||
| x['txt'] = x['txt'].decode('utf8') | ||
| x['wav'] = io.BytesIO(x['wav']) | ||
| if "messages" in x.keys(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这块我们不加了,
对于 SFT 的数据,我们系统上设计只支持 jsonl 的
| if 'messages' in item: # OpenAI role-content based SFT data | ||
| # OpenAI role-content based SFT data | ||
| # At least one pair of "user" and "assistant" | ||
| if 'messages' in item and len(item["messages"]) >= 2: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个直接在下面 assert len(item['messages']) >= 2
| @@ -0,0 +1,82 @@ | |||
| #!/usr/bin/env python3 | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
文件命名上我们都是用下划线形式compute_acc_of_contain.py
examples/belle_1.4M_qa/run.sh
Outdated
| python west/bin/decode.py \ | ||
| --data_path $data/chinese_qa.jsonl \ | ||
| --model_dir $mdir \ | ||
| --result_path $mdir/result.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
现在 main 分支中输出该 jsonl 了
Uh oh!
There was an error while loading. Please reload this page.