
Commit 0925ced

add chatgpt robust
1 parent 6e5aa1d commit 0925ced

35 files changed (+3,144 -0 lines)

README.md

Lines changed: 5 additions & 0 deletions
@@ -2,6 +2,11 @@
Latest research in robust machine learning, including adversarial/backdoor attack and defense, out-of-distribution (OOD) generalization, and safe transfer learning.

Hosted projects:
- ChatGPT robustness: [code](./chatgpt-robust/) | paper: [On the Robustness of ChatGPT: An Adversarial and Out-of-distribution Perspective](xxx)
- Stay tuned for more upcoming projects!

## Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a

chatgpt-robust/README.md

Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
# Robustness evaluation of ChatGPT

This repo contains the source code for the paper [On the Robustness of ChatGPT: An Adversarial and Out-of-distribution Perspective]().

![](./fig-robchat.png)

This project evaluates the robustness of ChatGPT as well as several foundation language models.
You can find the results [here](#results).

## Prerequisites

First, clone the repo and enter the project directory:
```
git clone https://github.com/microsoft/robustlearn.git
cd robustlearn/chatgpt-robust
```

Then, install the required dependencies:
- `pip install transformers pandas nltk jieba`
- Install the ChatGPT Python API (wrapper) from https://github.com/mmabrouk/chatgpt-wrapper
- Note that the ChatGPT wrapper must run on a local machine, since it needs to open a browser for login.

You can also create a conda virtual environment by running `conda env create -f environment.yml`.
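
To sanity-check the ChatGPT wrapper before running the full pipeline, a minimal smoke test looks roughly like the snippet below. This follows the wrapper's documented quick-start at the time of writing; its API may have changed, so treat the `ChatGPT` class and `ask` method as assumptions and check the wrapper's README.

```
# Minimal smoke test for the chatgpt-wrapper dependency (class/method names may differ across versions).
from chatgpt_wrapper import ChatGPT

bot = ChatGPT()  # uses the logged-in browser session
reply = bot.ask('Is the following sentence positive or negative? Answer me with "positive" or "negative", just one word. I love this movie.')
print(reply)
```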

## Usage

Everything runs through `main.py`; a sketch of the assumed command-line interface follows the examples below.

For classification tasks:
- Use Hugging Face models: `python main.py --dataset advglue --task sst2 --service hug --model xxx`
- Use the GPT API: `python main.py --dataset advglue --task sst2 --service gpt --model text-davinci-003`
- Use ChatGPT: `python main.py --dataset advglue --task sst2 --service chat`
- Note: the ChatGPT service must run on a local PC (it opens a browser for login), not on a headless server.

For translation tasks:
- Use Hugging Face models: `python main.py --dataset advglue-t --task translation_en_to_zh --service hug --model xx`

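The following is a hypothetical sketch of the command-line interface implied by the examples above (`--dataset`, `--task`, `--service`, `--model`); the actual `main.py` may expose more options or different defaults.

```
# Hypothetical sketch of the CLI implied by the usage examples above; the real main.py may differ.
import argparse

def parse_args():
    parser = argparse.ArgumentParser(description="Robustness evaluation of ChatGPT and foundation models")
    parser.add_argument("--dataset", choices=["advglue", "advglue-t", "flipkart", "ddxplus", "anli"],
                        default="advglue", help="which benchmark to evaluate")
    parser.add_argument("--task", default="sst2",
                        help="task within the dataset, e.g. sst2, qqp, mnli, translation_en_to_zh")
    parser.add_argument("--service", choices=["hug", "gpt", "chat"], default="hug",
                        help="hug = Hugging Face models, gpt = OpenAI GPT API, chat = ChatGPT wrapper")
    parser.add_argument("--model", default=None,
                        help="model name, e.g. a Hugging Face model id or text-davinci-003")
    return parser.parse_args()

if __name__ == "__main__":
    args = parse_args()
    print(args)
```
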
## Results

Note that you will not get the final results simply by running the code, since the outputs of generative models are not stable; some manual post-processing is required. Bad cases for AdvGLUE and Flipkart are provided in [this folder](./result/chatgpt_results/).

Here is a summary of the results.

### Adversarial robustness for classification

The metric is attack success rate (ASR); lower is better, and the best (lowest) value in each column is shown in bold.

| Model                    | SST-2     | QQP       | MNLI      | QNLI      | RTE       | ANLI      |
|--------------------------|-----------|-----------|-----------|-----------|-----------|-----------|
| Random                   | 50.0      | 50.0      | 66.7      | 50.0      | 50.0      | 66.7      |
| DeBERTa-L (435 M)        | 66.9      | 39.7      | 64.5      | 46.6      | 60.5      | 69.3      |
| BART-L (407 M)           | 56.1      | 62.8      | 58.7      | 52.0      | 56.8      | 57.7      |
| GPT-J                    | 48.7      | 59.0      | 73.6      | 50.0      | 56.8      | 66.5      |
| T5 (11 B)                | 40.5      | 59.0      | 48.8      | 49.7      | 56.8      | 68.6      |
| T0 (11 B)                | **36.5**  | 60.3      | 72.7      | 49.7      | 56.8      | 77.2      |
| NEOX-20B                 | 52.7      | 56.4      | 59.5      | 54.0      | 48.1      | 70.0      |
| OPT (66 B)               | 47.6      | 53.9      | 60.3      | 52.7      | 58.0      | 58.3      |
| BLOOM (176 B)            | 48.7      | 59.0      | 73.6      | 49.7      | 56.8      | 66.5      |
| text-davinci-002 (175 B) | 46.0      | 28.2      | 54.6      | 45.3      | 35.8      | 68.8      |
| text-davinci-003 (175 B) | 44.6      | 55.1      | 44.6      | 38.5      | 34.6      | 62.9      |
| ChatGPT (175 B)          | 39.9      | **18.0**  | **32.2**  | **34.5**  | **24.7**  | **55.3**  |

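ASR is read here as the percentage of adversarial examples on which the model's prediction does not match the ground-truth label. A minimal Python sketch under that assumption:

```
# Attack success rate (ASR), assuming it is the misclassification rate on adversarial examples.
def attack_success_rate(predictions, labels):
    """predictions and labels are equal-length lists of label ids (see LABEL_TO_ID in config.py)."""
    assert len(predictions) == len(labels)
    wrong = sum(1 for p, y in zip(predictions, labels) if p != y)
    return 100.0 * wrong / len(labels)

# Example: 3 of 4 adversarial examples fooled the model -> ASR = 75.0
print(attack_success_rate([1, 0, 0, 1], [0, 1, 1, 1]))
```
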
### Adversarial robustness for machine translation

The metrics are BLEU, GLEU, and METEOR (higher is better; the best value in each column is shown in bold).

| Model                       | BLEU     | GLEU      | METEOR    |
|-----------------------------|----------|-----------|-----------|
| Helsinki-NLP/opus-mt-en-zh  | 18.11    | 26.78     | 46.38     |
| liam168/trans-opus-mt-en-zh | 15.23    | 24.89     | 45.02     |
| text-davinci-002            | 24.97    | 36.3      | 59.28     |
| text-davinci-003            | **30.6** | **40.01** | **61.88** |
| ChatGPT                     | 26.27    | 37.29     | 58.95     |

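These scores can be computed with the dependencies listed in the prerequisites (nltk for the metrics, jieba for Chinese word segmentation). The sketch below is illustrative only; the tokenization and smoothing used for the reported numbers may differ, and METEOR can be computed analogously via `nltk.translate.meteor_score` (it requires the WordNet data).

```
# Sentence-level BLEU/GLEU for an en->zh translation, using nltk + jieba (both listed in the prerequisites).
import jieba
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from nltk.translate.gleu_score import sentence_gleu

reference = "今天天气很好。"     # ground-truth translation
hypothesis = "今天的天气不错。"  # model output

ref_tokens = list(jieba.cut(reference))
hyp_tokens = list(jieba.cut(hypothesis))

bleu = sentence_bleu([ref_tokens], hyp_tokens, smoothing_function=SmoothingFunction().method1)
gleu = sentence_gleu([ref_tokens], hyp_tokens)
print(f"BLEU: {bleu:.4f}, GLEU: {gleu:.4f}")
```
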
### Out-of-distribution robustness

The metric is F1 score (higher is better).

| Model                    | Flipkart | DDXPlus |
|--------------------------|----------|---------|
| Random                   | 20       | 4       |
| DeBERTa-L (435 M)        | 60.6     | 4.5     |
| BART-L (407 M)           | 57.8     | 5.3     |
| GPT-J                    | 28       | 2.4     |
| T5 (11 B)                | 58.8     | 6.3     |
| T0 (11 B)                | 58.3     | 8.4     |
| NEOX-20B                 | 39.4     | 12.3    |
| OPT (66 B)               | 44.5     | 0.3     |
| BLOOM (176 B)            | 28       | 0.1     |
| text-davinci-002 (175 B) | 57.5     | 18.9    |
| text-davinci-003 (175 B) | 57.3     | 19.6    |
| ChatGPT (175 B)          | 60.6     | 20.2    |

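For reference, a minimal sketch of per-class F1; the macro averaging below is an assumption, since the averaging scheme used for the reported numbers is not specified here.

```
# Per-class and macro-averaged F1, written out explicitly (the averaging scheme is an assumption).
def macro_f1(predictions, labels):
    """predictions and labels are equal-length lists of label ids (see LABEL_TO_ID in config.py)."""
    classes = sorted(set(labels) | set(predictions))
    f1s = []
    for c in classes:
        tp = sum(1 for p, y in zip(predictions, labels) if p == c and y == c)
        fp = sum(1 for p, y in zip(predictions, labels) if p == c and y != c)
        fn = sum(1 for p, y in zip(predictions, labels) if p != c and y == c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall) if precision + recall else 0.0)
    return 100.0 * sum(f1s) / len(f1s)

print(macro_f1([0, 1, 2, 1], [0, 1, 1, 1]))  # toy example, prints 60.0
```
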
## Citation

To do.

chatgpt-robust/config.py

Lines changed: 162 additions & 0 deletions
@@ -0,0 +1,162 @@
LABEL_SET = {
    'sst2': ['positive', 'negative'],
    'mnli': ['entailment', 'neutral', 'contradiction'],
    'anli': ['entailment', 'neutral', 'contradiction'],
    'qqp': ['equivalent', 'not_equivalent'],
    'qnli': ['entailment', 'not_entailment'],
    'mnli-mm': ['entailment', 'neutral', 'contradiction'],
    'rte': ['entailment', 'not_entailment'],
'ddxplus': ['spontaneous pneumothorax', 'cluster headache', 'boerhaave', 'spontaneous rib fracture', 'gerd', 'hiv (initial infection)', 'anemia', 'viral pharyngitis', 'inguinal hernia', 'myasthenia gravis', 'whooping cough', 'anaphylaxis', 'epiglottitis', 'guillain-barré syndrome', 'acute laryngitis', 'croup', 'psvt', 'atrial fibrillation', 'bronchiectasis', 'allergic sinusitis', 'chagas', 'scombroid food poisoning', 'myocarditis', 'larygospasm', 'acute dystonic reactions', 'localized edema', 'sle', 'tuberculosis', 'unstable angina', 'stable angina', 'ebola', 'acute otitis media', 'panic attack', 'bronchospasm / acute asthma exacerbation', 'bronchitis', 'acute copd exacerbation / infection', 'pulmonary embolism', 'urti', 'influenza', 'pneumonia', 'acute rhinosinusitis', 'chronic rhinosinusitis', 'bronchiolitis', 'pulmonary neoplasm', 'possible nstemi / stemi', 'sarcoidosis', 'pancreatic neoplasm', 'acute pulmonary edema', 'pericarditis', 'cannot decide'],
    'flipkart': ['positive', 'negative', 'neutral'],
}

MODEL_SET = {
    'hug_zs': [  # zero-shot classification using fine-tuned models
        'cross-encoder/nli-deberta-v3-large',
        'sentence-transformers/nli-roberta-large',
        'facebook/bart-large-mnli'
    ],
    'hug_gen': [  # generative big models using text-generation
        'google/flan-t5-large',
        'facebook/opt-66b',
        'bigscience/bloomz-7b1',
        'bigscience/T0pp',
        'bigscience/bloom',
        'EleutherAI/gpt-j-6B',
        'EleutherAI/gpt-neox-20b',
        'BAAI/glm-10b'
    ],
    'gpt': [
        # 'text-ada-001',
        'text-davinci-002',
        'text-davinci-003',
    ],
    'chat': ['bert-large-uncased']
}

PROMPT_SET = {
    'sst2': [
        'Is the following sentence positive or negative? Answer me with "positive" or "negative", just one word. ',
        'Please classify the following sentence into either positive or negative. Answer me with "positive" or "negative", just one word. ',
    ],
    'qqp': [
        'Are the following two questions equivalent or not? Answer me with "equivalent" or "not_equivalent". ',
    ],
    'mnli': [
        'Are the following two sentences entailment, neutral or contradiction? Answer me with "entailment", "neutral" or "contradiction". ',
    ],
    'anli': [
        'Are the following paragraph entailment, neutral or contradiction? Answer me with "entailment", "neutral" or "contradiction". The answer should be a single word. The answer is: ',
    ],
    'qnli': [
        'Are the following question and sentence entailment or not_entailment? Answer me with "entailment" or "not_entailment". ',
    ],
    'mnli-mm': [
        'Are the following two sentences entailment, neutral or contradiction? Answer me with "entailment", "neutral" or "contradiction". ',
    ],
    'flipkart': [
        'Is the following sentence positive, neutral, or negative? Answer me with "positive", "neutral", or "negative", just one word. ',
    ],
    'rte': [
        'Are the following two sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment". ',
    ],
    'ddxplus': [
"Imagine you are an intern doctor. Based on the previous dialogue, what is the diagnosis? Select one answer among the following lists: ['spontaneous pneumothorax', 'cluster headache', 'boerhaave', 'spontaneous rib fracture', 'gerd', 'hiv (initial infection)', 'anemia', 'viral pharyngitis', 'inguinal hernia', 'myasthenia gravis', 'whooping cough', 'anaphylaxis', 'epiglottitis', 'guillain-barré syndrome', 'acute laryngitis', 'croup', 'psvt', 'atrial fibrillation', 'bronchiectasis', 'allergic sinusitis', 'chagas', 'scombroid food poisoning', 'myocarditis', 'larygospasm', 'acute dystonic reactions', 'localized edema', 'sle', 'tuberculosis', 'unstable angina', 'stable angina', 'ebola', 'acute otitis media', 'panic attack', 'bronchospasm / acute asthma exacerbation', 'bronchitis', 'acute copd exacerbation / infection', 'pulmonary embolism', 'urti', 'influenza', 'pneumonia', 'acute rhinosinusitis', 'chronic rhinosinusitis', 'bronchiolitis', 'pulmonary neoplasm', 'possible nstemi / stemi', 'sarcoidosis', 'pancreatic neoplasm', 'acute pulmonary edema', 'pericarditis', 'cannot decide']. The answer should be a single word. The answer is: "
    ],
    'translation_en_to_zh': [
        'Translate the following sentence from Engilish to Chinese. '
    ],
    'translation_zh_to_en': [
        'Translate the following sentence from Chinese to English. '
    ]
}

LABEL_TO_ID = {
    'sst2': {'negative': 0, 'positive': 1, 'neutral': 2},
    'mnli': {'entailment': 0, 'neutral': 1, 'contradiction': 2},
    'qqp': {'equivalent': 1, 'not_equivalent': 0},
    'qnli': {'entailment': 0, 'not_entailment': 1},
    'flipkart': {'negative': 0, 'positive': 1, 'neutral': 2},
    # 'mnli-mm': {'entailment': 0, 'neutral': 1, 'contradiction': 2},
    'rte': {'entailment': 0, 'not_entailment': 1},
'ddxplus': {'spontaneous pneumothorax': 0, 'cluster headache': 1, 'boerhaave': 2, 'spontaneous rib fracture': 3, 'gerd': 4, 'hiv (initial infection)': 5, 'anemia': 6, 'viral pharyngitis': 7, 'inguinal hernia': 8, 'myasthenia gravis': 9, 'whooping cough': 10, 'anaphylaxis': 11, 'epiglottitis': 12, 'guillain-barré syndrome': 13, 'acute laryngitis': 14, 'croup': 15, 'psvt': 16, 'atrial fibrillation': 17, 'bronchiectasis': 18, 'allergic sinusitis': 19, 'chagas': 20, 'scombroid food poisoning': 21, 'myocarditis': 22, 'larygospasm': 23, 'acute dystonic reactions': 24, 'localized edema': 25, 'sle': 26, 'tuberculosis': 27, 'unstable angina': 28, 'stable angina': 29, 'ebola': 30, 'acute otitis media': 31, 'panic attack': 32, 'bronchospasm / acute asthma exacerbation': 33, 'bronchitis': 34, 'acute copd exacerbation / infection': 35, 'pulmonary embolism': 36, 'urti': 37, 'influenza': 38, 'pneumonia': 39, 'acute rhinosinusitis': 40, 'chronic rhinosinusitis': 41, 'bronchiolitis': 42, 'pulmonary neoplasm': 43, 'possible nstemi / stemi': 44, 'sarcoidosis': 45, 'pancreatic neoplasm': 46, 'acute pulmonary edema': 47, 'pericarditis': 48, 'cannot decide': 49},
    'anli': {'entailment': 0, 'neutral': 1, 'contradiction': 2},
}

ID_TO_LABEL = {
    'sst2': {0: 'negative', 1: 'positive', 2: 'neutral'},
    'mnli': {0: 'entailment', 1: 'neutral', 2: 'contradiction'},
    'qqp': {1: 'equivalent', 0: 'not_equivalent'},
    'qnli': {0: 'entailment', 1: 'not_entailment'},
    'flipkart': {0: 'negative', 1: 'positive', 2: 'neutral'},
    # 'mnli-mm': {0: 'entailment', 1: 'neutral', 2: 'contradiction'},
    'rte': {0: 'entailment', 1: 'not_entailment'},
'ddxplus': {0: 'spontaneous pneumothorax', 1: 'cluster headache', 2: 'boerhaave', 3: 'spontaneous rib fracture', 4: 'gerd', 5: 'hiv (initial infection)', 6: 'anemia', 7: 'viral pharyngitis', 8: 'inguinal hernia', 9: 'myasthenia gravis', 10: 'whooping cough', 11: 'anaphylaxis', 12: 'epiglottitis', 13: 'guillain-barré syndrome', 14: 'acute laryngitis', 15: 'croup', 16: 'psvt', 17: 'atrial fibrillation', 18: 'bronchiectasis', 19: 'allergic sinusitis', 20: 'chagas', 21: 'scombroid food poisoning', 22: 'myocarditis', 23: 'larygospasm', 24: 'acute dystonic reactions', 25: 'localized edema', 26: 'sle', 27: 'tuberculosis', 28: 'unstable angina', 29: 'stable angina', 30: 'ebola', 31: 'acute otitis media', 32: 'panic attack', 33: 'bronchospasm / acute asthma exacerbation', 34: 'bronchitis', 35: 'acute copd exacerbation / infection', 36: 'pulmonary embolism', 37: 'urti', 38: 'influenza', 39: 'pneumonia', 40: 'acute rhinosinusitis', 41: 'chronic rhinosinusitis', 42: 'bronchiolitis', 43: 'pulmonary neoplasm', 44: 'possible nstemi / stemi', 45: 'sarcoidosis', 46: 'pancreatic neoplasm', 47: 'acute pulmonary edema', 48: 'pericarditis', 49: 'cannot decide'},
}

DATA_PATH = {
    'advglue': './data/advglue/dev.json',
    'flipkart': './data/flipkart/flipkart_review.csv',
    'ddxplus': './data/ddxplus/ddxplus.csv',
    'anli': './data/anli/test.jsonl',
    'advglue-t': './data/advglue-t/translation.json',
}

MODEL_SET_TRANS = {
    'gpt': [
        # 'text-ada-001',
        'text-davinci-002',
        'Helsinki-NLP/opus-mt-en-zh',
        'liam168/trans-opus-mt-en-zh',
        'text-davinci-003',
    ],
    'chatgpt': [
        'chatgpt'
    ]
}

OPENAI_KEYS = {
    'api_key': "xxxxxxx",
    'api_token': "xxxxxxx"
}

PROMPT_SET2 = {
    'sst2': [
        'Is the following sentence positive or negative? Answer me with "positive" or "negative", just one word. ',
        'Please classify the following sentence into either positive or negative. If it is positive, reply 1, otherwise, reply 0. Just answer me with a single number. ',
    ],
    'qqp': [
        'Are the following two questions equivalent or not? If they are equivalent, answer me with 1, otherwise, answer me with 0. Just answer me with a single number. ',
    ],
    'mnli': [
        'Are the following two sentences entailment, neutral or contradiction? Answer me with 0 if they are entailment, answer me with 2 if they are contradiction, otherwise answer me with 1. ',
    ],
    'anli': [
        'Are the following two sentences entailment, neutral or contradiction? Answer me with 1 if they are entailment, answer me with 0 if they are contradiction, otherwise answer me with 2. ',
    ],
    'qnli': [
        'Are the following question and sentence entailment or not entailment? Answer me with 0 if they are entailment, otherwise 1. ',
    ],
    'mnli-mm': [
        'Are the following two sentences entailment, neutral or contradiction? Answer me with "entailment", "neutral" or "contradiction". ',
    ],
    'flipkart': [
        'Is the following sentence positive, neutral, or negative? Answer me with "positive", "neutral", or "negative", just one word. ',
    ],
    'rte': [
        'Are the following two sentences entailment or not? Answer me with 0 if they are entailment, otherwise answer me with 1. ',
    ],
    'ddxplus': [
"Imagine you are an intern doctor. Based on the previous dialogue, what is the diagnosis? Select one answer among the following lists: ['spontaneous pneumothorax', 'cluster headache', 'boerhaave', 'spontaneous rib fracture', 'gerd', 'hiv (initial infection)', 'anemia', 'viral pharyngitis', 'inguinal hernia', 'myasthenia gravis', 'whooping cough', 'anaphylaxis', 'epiglottitis', 'guillain-barré syndrome', 'acute laryngitis', 'croup', 'psvt', 'atrial fibrillation', 'bronchiectasis', 'allergic sinusitis', 'chagas', 'scombroid food poisoning', 'myocarditis', 'larygospasm', 'acute dystonic reactions', 'localized edema', 'sle', 'tuberculosis', 'unstable angina', 'stable angina', 'ebola', 'acute otitis media', 'panic attack', 'bronchospasm / acute asthma exacerbation', 'bronchitis', 'acute copd exacerbation / infection', 'pulmonary embolism', 'urti', 'influenza', 'pneumonia', 'acute rhinosinusitis', 'chronic rhinosinusitis', 'bronchiolitis', 'pulmonary neoplasm', 'possible nstemi / stemi', 'sarcoidosis', 'pancreatic neoplasm', 'acute pulmonary edema', 'pericarditis', 'cannot decide']. The answer should be a single word. The answer is: "
    ],
    'translation_en_to_zh': [
        'Translate the following sentence from Engilish to Chinese. '
    ],
    'translation_zh_to_en': [
        'Translate the following sentence from Chinese to English. '
    ]
}
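
As a quick illustration of how these tables fit together (the real pipeline lives in `main.py` and the data loaders, so the glue code below is hypothetical): `PROMPT_SET` entries are prepended to the input text for the prompted services, `LABEL_TO_ID` maps a model's textual reply back to a label id, and `LABEL_SET` doubles as the candidate-label list for the zero-shot models in `MODEL_SET['hug_zs']`.

```
# Hypothetical glue code showing the intended use of the config tables; main.py implements the real pipeline.
from transformers import pipeline

from config import LABEL_SET, LABEL_TO_ID, PROMPT_SET

task, sentence = 'sst2', 'the movie was a complete waste of time'

# Prompted services (GPT API / ChatGPT): build a query and map the reply back to a label id.
query = PROMPT_SET[task][0] + sentence
reply = 'negative'  # in practice this comes from the model, e.g. reply = bot.ask(query)
label_id = LABEL_TO_ID[task].get(reply.strip().lower(), -1)  # -1 if the reply is not a known label

# Zero-shot classifiers from MODEL_SET['hug_zs']: LABEL_SET provides the candidate labels.
classifier = pipeline('zero-shot-classification', model='facebook/bart-large-mnli')
prediction = classifier(sentence, candidate_labels=LABEL_SET[task])['labels'][0]

print(label_id, prediction)
```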
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
# AdvGLUE-T

30 sentences selected from the AdvGLUE dev set.

The "source" field is the adversarial source sentence and the "target" field is its ground-truth translation.
