
Commit 0925ced

add chatgpt robust
1 parent 6e5aa1d commit 0925ced

35 files changed (+3,144 -0 lines)

README.md

Lines changed: 5 additions & 0 deletions
@@ -2,6 +2,11 @@
Latest research in robust machine learning, including adversarial/backdoor attack and defense, out-of-distribution (OOD) generalization, and safe transfer learning.

Hosted projects:
- ChatGPT robustness: [code](./chatgpt-robust/) | paper: [On the Robustness of ChatGPT: An Adversarial and Out-of-distribution Perspective](xxx)
- Stay tuned for more upcoming projects!

## Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a

chatgpt-robust/README.md

Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
# Robustness evaluation of ChatGPT

This repo contains the source code for the paper [On the Robustness of ChatGPT: An Adversarial and Out-of-distribution Perspective]().

![](./fig-robchat.png)

This project evaluates the robustness of ChatGPT as well as several foundation language models.
You can find the results [here](#results).

## Prerequisites

First, clone the repo and enter the project directory:
```
git clone https://github.com/microsoft/robustlearn.git
cd robustlearn/chatgpt-robust
```

Then, install the required dependencies:
- `pip install transformers pandas nltk jieba`
- Install the ChatGPT Python API (wrapper) from https://github.com/mmabrouk/chatgpt-wrapper
- Note that the ChatGPT wrapper must run on a local machine, since it needs to open a browser for login.

You can also create a conda virtual environment by running `conda env create -f environment.yml`.
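
To sanity-check the ChatGPT wrapper before running the full pipeline, a minimal smoke test looks roughly like the snippet below. This follows the wrapper's documented quick-start at the time of writing; its API may have changed, so treat the `ChatGPT` class and `ask` method as assumptions and check the wrapper's README.

```
# Minimal smoke test for the chatgpt-wrapper dependency (class/method names may differ across versions).
from chatgpt_wrapper import ChatGPT

bot = ChatGPT()  # uses the logged-in browser session
reply = bot.ask('Is the following sentence positive or negative? Answer me with "positive" or "negative", just one word. I love this movie.')
print(reply)
```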

## Usage

Everything runs through `main.py`; a sketch of the assumed command-line interface follows the examples below.

For classification tasks:
- Use Hugging Face models: `python main.py --dataset advglue --task sst2 --service hug --model xxx`
- Use the GPT API: `python main.py --dataset advglue --task sst2 --service gpt --model text-davinci-003`
- Use ChatGPT: `python main.py --dataset advglue --task sst2 --service chat`
- Note: the ChatGPT service must run on a local PC (it opens a browser for login), not on a headless server.

For translation tasks:
- Use Hugging Face models: `python main.py --dataset advglue-t --task translation_en_to_zh --service hug --model xx`

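The following is a hypothetical sketch of the command-line interface implied by the examples above (`--dataset`, `--task`, `--service`, `--model`); the actual `main.py` may expose more options or different defaults.

```
# Hypothetical sketch of the CLI implied by the usage examples above; the real main.py may differ.
import argparse

def parse_args():
    parser = argparse.ArgumentParser(description="Robustness evaluation of ChatGPT and foundation models")
    parser.add_argument("--dataset", choices=["advglue", "advglue-t", "flipkart", "ddxplus", "anli"],
                        default="advglue", help="which benchmark to evaluate")
    parser.add_argument("--task", default="sst2",
                        help="task within the dataset, e.g. sst2, qqp, mnli, translation_en_to_zh")
    parser.add_argument("--service", choices=["hug", "gpt", "chat"], default="hug",
                        help="hug = Hugging Face models, gpt = OpenAI GPT API, chat = ChatGPT wrapper")
    parser.add_argument("--model", default=None,
                        help="model name, e.g. a Hugging Face model id or text-davinci-003")
    return parser.parse_args()

if __name__ == "__main__":
    args = parse_args()
    print(args)
```
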
## Results

Note that you will not get the final results simply by running the code, since the outputs of generative models are not stable; some manual post-processing is required. Bad cases for AdvGLUE and Flipkart are provided in [this folder](./result/chatgpt_results/).

Here is a summary of the results.

### Adversarial robustness for classification

The metric is attack success rate (ASR); lower is better, and the best (lowest) value in each column is shown in bold.

| Model                    | SST-2     | QQP       | MNLI      | QNLI      | RTE       | ANLI      |
|--------------------------|-----------|-----------|-----------|-----------|-----------|-----------|
| Random                   | 50.0      | 50.0      | 66.7      | 50.0      | 50.0      | 66.7      |
| DeBERTa-L (435 M)        | 66.9      | 39.7      | 64.5      | 46.6      | 60.5      | 69.3      |
| BART-L (407 M)           | 56.1      | 62.8      | 58.7      | 52.0      | 56.8      | 57.7      |
| GPT-J                    | 48.7      | 59.0      | 73.6      | 50.0      | 56.8      | 66.5      |
| T5 (11 B)                | 40.5      | 59.0      | 48.8      | 49.7      | 56.8      | 68.6      |
| T0 (11 B)                | **36.5**  | 60.3      | 72.7      | 49.7      | 56.8      | 77.2      |
| NEOX-20B                 | 52.7      | 56.4      | 59.5      | 54.0      | 48.1      | 70.0      |
| OPT (66 B)               | 47.6      | 53.9      | 60.3      | 52.7      | 58.0      | 58.3      |
| BLOOM (176 B)            | 48.7      | 59.0      | 73.6      | 49.7      | 56.8      | 66.5      |
| text-davinci-002 (175 B) | 46.0      | 28.2      | 54.6      | 45.3      | 35.8      | 68.8      |
| text-davinci-003 (175 B) | 44.6      | 55.1      | 44.6      | 38.5      | 34.6      | 62.9      |
| ChatGPT (175 B)          | 39.9      | **18.0**  | **32.2**  | **34.5**  | **24.7**  | **55.3**  |

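ASR is read here as the percentage of adversarial examples on which the model's prediction does not match the ground-truth label. A minimal Python sketch under that assumption:

```
# Attack success rate (ASR), assuming it is the misclassification rate on adversarial examples.
def attack_success_rate(predictions, labels):
    """predictions and labels are equal-length lists of label ids (see LABEL_TO_ID in config.py)."""
    assert len(predictions) == len(labels)
    wrong = sum(1 for p, y in zip(predictions, labels) if p != y)
    return 100.0 * wrong / len(labels)

# Example: 3 of 4 adversarial examples fooled the model -> ASR = 75.0
print(attack_success_rate([1, 0, 0, 1], [0, 1, 1, 1]))
```
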
### Adversarial robustness for machine translation

The metrics are BLEU, GLEU, and METEOR (higher is better; the best value in each column is shown in bold).

| Model                       | BLEU     | GLEU      | METEOR    |
|-----------------------------|----------|-----------|-----------|
| Helsinki-NLP/opus-mt-en-zh  | 18.11    | 26.78     | 46.38     |
| liam168/trans-opus-mt-en-zh | 15.23    | 24.89     | 45.02     |
| text-davinci-002            | 24.97    | 36.3      | 59.28     |
| text-davinci-003            | **30.6** | **40.01** | **61.88** |
| ChatGPT                     | 26.27    | 37.29     | 58.95     |

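These scores can be computed with the dependencies listed in the prerequisites (nltk for the metrics, jieba for Chinese word segmentation). The sketch below is illustrative only; the tokenization and smoothing used for the reported numbers may differ, and METEOR can be computed analogously via `nltk.translate.meteor_score` (it requires the WordNet data).

```
# Sentence-level BLEU/GLEU for an en->zh translation, using nltk + jieba (both listed in the prerequisites).
import jieba
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from nltk.translate.gleu_score import sentence_gleu

reference = "今天天气很好。"     # ground-truth translation
hypothesis = "今天的天气不错。"  # model output

ref_tokens = list(jieba.cut(reference))
hyp_tokens = list(jieba.cut(hypothesis))

bleu = sentence_bleu([ref_tokens], hyp_tokens, smoothing_function=SmoothingFunction().method1)
gleu = sentence_gleu([ref_tokens], hyp_tokens)
print(f"BLEU: {bleu:.4f}, GLEU: {gleu:.4f}")
```
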
### Out-of-distribution robustness

The metric is F1 score (higher is better).

| Model                    | Flipkart | DDXPlus |
|--------------------------|----------|---------|
| Random                   | 20       | 4       |
| DeBERTa-L (435 M)        | 60.6     | 4.5     |
| BART-L (407 M)           | 57.8     | 5.3     |
| GPT-J                    | 28       | 2.4     |
| T5 (11 B)                | 58.8     | 6.3     |
| T0 (11 B)                | 58.3     | 8.4     |
| NEOX-20B                 | 39.4     | 12.3    |
| OPT (66 B)               | 44.5     | 0.3     |
| BLOOM (176 B)            | 28       | 0.1     |
| text-davinci-002 (175 B) | 57.5     | 18.9    |
| text-davinci-003 (175 B) | 57.3     | 19.6    |
| ChatGPT (175 B)          | 60.6     | 20.2    |

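For reference, a minimal sketch of per-class F1; the macro averaging below is an assumption, since the averaging scheme used for the reported numbers is not specified here.

```
# Per-class and macro-averaged F1, written out explicitly (the averaging scheme is an assumption).
def macro_f1(predictions, labels):
    """predictions and labels are equal-length lists of label ids (see LABEL_TO_ID in config.py)."""
    classes = sorted(set(labels) | set(predictions))
    f1s = []
    for c in classes:
        tp = sum(1 for p, y in zip(predictions, labels) if p == c and y == c)
        fp = sum(1 for p, y in zip(predictions, labels) if p == c and y != c)
        fn = sum(1 for p, y in zip(predictions, labels) if p != c and y == c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall) if precision + recall else 0.0)
    return 100.0 * sum(f1s) / len(f1s)

print(macro_f1([0, 1, 2, 1], [0, 1, 1, 1]))  # toy example, prints 60.0
```
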
## Citation

To do.

chatgpt-robust/config.py

Lines changed: 162 additions & 0 deletions
@@ -0,0 +1,162 @@
LABEL_SET = {
    'sst2': ['positive', 'negative'],
    'mnli': ['entailment', 'neutral', 'contradiction'],
    'anli': ['entailment', 'neutral', 'contradiction'],
    'qqp': ['equivalent', 'not_equivalent'],
    'qnli': ['entailment', 'not_entailment'],
    'mnli-mm': ['entailment', 'neutral', 'contradiction'],
    'rte': ['entailment', 'not_entailment'],
'ddxplus': ['spontaneous pneumothorax', 'cluster headache', 'boerhaave', 'spontaneous rib fracture', 'gerd', 'hiv (initial infection)', 'anemia', 'viral pharyngitis', 'inguinal hernia', 'myasthenia gravis', 'whooping cough', 'anaphylaxis', 'epiglottitis', 'guillain-barré syndrome', 'acute laryngitis', 'croup', 'psvt', 'atrial fibrillation', 'bronchiectasis', 'allergic sinusitis', 'chagas', 'scombroid food poisoning', 'myocarditis', 'larygospasm', 'acute dystonic reactions', 'localized edema', 'sle', 'tuberculosis', 'unstable angina', 'stable angina', 'ebola', 'acute otitis media', 'panic attack', 'bronchospasm / acute asthma exacerbation', 'bronchitis', 'acute copd exacerbation / infection', 'pulmonary embolism', 'urti', 'influenza', 'pneumonia', 'acute rhinosinusitis', 'chronic rhinosinusitis', 'bronchiolitis', 'pulmonary neoplasm', 'possible nstemi / stemi', 'sarcoidosis', 'pancreatic neoplasm', 'acute pulmonary edema', 'pericarditis', 'cannot decide'],
    'flipkart': ['positive', 'negative', 'neutral'],
}

MODEL_SET = {
    'hug_zs': [  # zero-shot classification using fine-tuned models
        'cross-encoder/nli-deberta-v3-large',
        'sentence-transformers/nli-roberta-large',
        'facebook/bart-large-mnli'
    ],
    'hug_gen': [  # generative big models using text-generation
        'google/flan-t5-large',
        'facebook/opt-66b',
        'bigscience/bloomz-7b1',
        'bigscience/T0pp',
        'bigscience/bloom',
        'EleutherAI/gpt-j-6B',
        'EleutherAI/gpt-neox-20b',
        'BAAI/glm-10b'
    ],
    'gpt': [
        # 'text-ada-001',
        'text-davinci-002',
        'text-davinci-003',
    ],
    'chat': ['bert-large-uncased']
}

PROMPT_SET = {
    'sst2': [
        'Is the following sentence positive or negative? Answer me with "positive" or "negative", just one word. ',
        'Please classify the following sentence into either positive or negative. Answer me with "positive" or "negative", just one word. ',
    ],
    'qqp': [
        'Are the following two questions equivalent or not? Answer me with "equivalent" or "not_equivalent". ',
    ],
    'mnli': [
        'Are the following two sentences entailment, neutral or contradiction? Answer me with "entailment", "neutral" or "contradiction". ',
    ],
    'anli': [
        'Are the following paragraph entailment, neutral or contradiction? Answer me with "entailment", "neutral" or "contradiction". The answer should be a single word. The answer is: ',
    ],
    'qnli': [
        'Are the following question and sentence entailment or not_entailment? Answer me with "entailment" or "not_entailment". ',
    ],
    'mnli-mm': [
        'Are the following two sentences entailment, neutral or contradiction? Answer me with "entailment", "neutral" or "contradiction". ',
    ],
    'flipkart': [
        'Is the following sentence positive, neutral, or negative? Answer me with "positive", "neutral", or "negative", just one word. ',
    ],
    'rte': [
        'Are the following two sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment". ',
    ],
    'ddxplus': [
"Imagine you are an intern doctor. Based on the previous dialogue, what is the diagnosis? Select one answer among the following lists: ['spontaneous pneumothorax', 'cluster headache', 'boerhaave', 'spontaneous rib fracture', 'gerd', 'hiv (initial infection)', 'anemia', 'viral pharyngitis', 'inguinal hernia', 'myasthenia gravis', 'whooping cough', 'anaphylaxis', 'epiglottitis', 'guillain-barré syndrome', 'acute laryngitis', 'croup', 'psvt', 'atrial fibrillation', 'bronchiectasis', 'allergic sinusitis', 'chagas', 'scombroid food poisoning', 'myocarditis', 'larygospasm', 'acute dystonic reactions', 'localized edema', 'sle', 'tuberculosis', 'unstable angina', 'stable angina', 'ebola', 'acute otitis media', 'panic attack', 'bronchospasm / acute asthma exacerbation', 'bronchitis', 'acute copd exacerbation / infection', 'pulmonary embolism', 'urti', 'influenza', 'pneumonia', 'acute rhinosinusitis', 'chronic rhinosinusitis', 'bronchiolitis', 'pulmonary neoplasm', 'possible nstemi / stemi', 'sarcoidosis', 'pancreatic neoplasm', 'acute pulmonary edema', 'pericarditis', 'cannot decide']. The answer should be a single word. The answer is: "
    ],
    'translation_en_to_zh': [
        'Translate the following sentence from Engilish to Chinese. '
    ],
    'translation_zh_to_en': [
        'Translate the following sentence from Chinese to English. '
    ]
}

LABEL_TO_ID = {
    'sst2': {'negative': 0, 'positive': 1, 'neutral': 2},
    'mnli': {'entailment': 0, 'neutral': 1, 'contradiction': 2},
    'qqp': {'equivalent': 1, 'not_equivalent': 0},
    'qnli': {'entailment': 0, 'not_entailment': 1},
    'flipkart': {'negative': 0, 'positive': 1, 'neutral': 2},
    # 'mnli-mm': {'entailment': 0, 'neutral': 1, 'contradiction': 2},
    'rte': {'entailment': 0, 'not_entailment': 1},
'ddxplus': {'spontaneous pneumothorax': 0, 'cluster headache': 1, 'boerhaave': 2, 'spontaneous rib fracture': 3, 'gerd': 4, 'hiv (initial infection)': 5, 'anemia': 6, 'viral pharyngitis': 7, 'inguinal hernia': 8, 'myasthenia gravis': 9, 'whooping cough': 10, 'anaphylaxis': 11, 'epiglottitis': 12, 'guillain-barré syndrome': 13, 'acute laryngitis': 14, 'croup': 15, 'psvt': 16, 'atrial fibrillation': 17, 'bronchiectasis': 18, 'allergic sinusitis': 19, 'chagas': 20, 'scombroid food poisoning': 21, 'myocarditis': 22, 'larygospasm': 23, 'acute dystonic reactions': 24, 'localized edema': 25, 'sle': 26, 'tuberculosis': 27, 'unstable angina': 28, 'stable angina': 29, 'ebola': 30, 'acute otitis media': 31, 'panic attack': 32, 'bronchospasm / acute asthma exacerbation': 33, 'bronchitis': 34, 'acute copd exacerbation / infection': 35, 'pulmonary embolism': 36, 'urti': 37, 'influenza': 38, 'pneumonia': 39, 'acute rhinosinusitis': 40, 'chronic rhinosinusitis': 41, 'bronchiolitis': 42, 'pulmonary neoplasm': 43, 'possible nstemi / stemi': 44, 'sarcoidosis': 45, 'pancreatic neoplasm': 46, 'acute pulmonary edema': 47, 'pericarditis': 48, 'cannot decide': 49},
    'anli': {'entailment': 0, 'neutral': 1, 'contradiction': 2},
}

ID_TO_LABEL = {
    'sst2': {0: 'negative', 1: 'positive', 2: 'neutral'},
    'mnli': {0: 'entailment', 1: 'neutral', 2: 'contradiction'},
    'qqp': {1: 'equivalent', 0: 'not_equivalent'},
    'qnli': {0: 'entailment', 1: 'not_entailment'},
    'flipkart': {0: 'negative', 1: 'positive', 2: 'neutral'},
    # 'mnli-mm': {0: 'entailment', 1: 'neutral', 2: 'contradiction'},
    'rte': {0: 'entailment', 1: 'not_entailment'},
'ddxplus': {0: 'spontaneous pneumothorax', 1: 'cluster headache', 2: 'boerhaave', 3: 'spontaneous rib fracture', 4: 'gerd', 5: 'hiv (initial infection)', 6: 'anemia', 7: 'viral pharyngitis', 8: 'inguinal hernia', 9: 'myasthenia gravis', 10: 'whooping cough', 11: 'anaphylaxis', 12: 'epiglottitis', 13: 'guillain-barré syndrome', 14: 'acute laryngitis', 15: 'croup', 16: 'psvt', 17: 'atrial fibrillation', 18: 'bronchiectasis', 19: 'allergic sinusitis', 20: 'chagas', 21: 'scombroid food poisoning', 22: 'myocarditis', 23: 'larygospasm', 24: 'acute dystonic reactions', 25: 'localized edema', 26: 'sle', 27: 'tuberculosis', 28: 'unstable angina', 29: 'stable angina', 30: 'ebola', 31: 'acute otitis media', 32: 'panic attack', 33: 'bronchospasm / acute asthma exacerbation', 34: 'bronchitis', 35: 'acute copd exacerbation / infection', 36: 'pulmonary embolism', 37: 'urti', 38: 'influenza', 39: 'pneumonia', 40: 'acute rhinosinusitis', 41: 'chronic rhinosinusitis', 42: 'bronchiolitis', 43: 'pulmonary neoplasm', 44: 'possible nstemi / stemi', 45: 'sarcoidosis', 46: 'pancreatic neoplasm', 47: 'acute pulmonary edema', 48: 'pericarditis', 49: 'cannot decide'},
}

DATA_PATH = {
    'advglue': './data/advglue/dev.json',
    'flipkart': './data/flipkart/flipkart_review.csv',
    'ddxplus': './data/ddxplus/ddxplus.csv',
    'anli': './data/anli/test.jsonl',
    'advglue-t': './data/advglue-t/translation.json',
}

MODEL_SET_TRANS = {
    'gpt': [
        # 'text-ada-001',
        'text-davinci-002',
        'Helsinki-NLP/opus-mt-en-zh',
        'liam168/trans-opus-mt-en-zh',
        'text-davinci-003',
    ],
    'chatgpt': [
        'chatgpt'
    ]
}

OPENAI_KEYS = {
    'api_key': "xxxxxxx",
    'api_token': "xxxxxxx"
}

PROMPT_SET2 = {
    'sst2': [
        'Is the following sentence positive or negative? Answer me with "positive" or "negative", just one word. ',
        'Please classify the following sentence into either positive or negative. If it is positive, reply 1, otherwise, reply 0. Just answer me with a single number. ',
    ],
    'qqp': [
        'Are the following two questions equivalent or not? If they are equivalent, answer me with 1, otherwise, answer me with 0. Just answer me with a single number. ',
    ],
    'mnli': [
        'Are the following two sentences entailment, neutral or contradiction? Answer me with 0 if they are entailment, answer me with 2 if they are contradiction, otherwise answer me with 1. ',
    ],
    'anli': [
        'Are the following two sentences entailment, neutral or contradiction? Answer me with 1 if they are entailment, answer me with 0 if they are contradiction, otherwise answer me with 2. ',
    ],
    'qnli': [
        'Are the following question and sentence entailment or not entailment? Answer me with 0 if they are entailment, otherwise 1. ',
    ],
    'mnli-mm': [
        'Are the following two sentences entailment, neutral or contradiction? Answer me with "entailment", "neutral" or "contradiction". ',
    ],
    'flipkart': [
        'Is the following sentence positive, neutral, or negative? Answer me with "positive", "neutral", or "negative", just one word. ',
    ],
    'rte': [
        'Are the following two sentences entailment or not? Answer me with 0 if they are entailment, otherwise answer me with 1. ',
    ],
    'ddxplus': [
"Imagine you are an intern doctor. Based on the previous dialogue, what is the diagnosis? Select one answer among the following lists: ['spontaneous pneumothorax', 'cluster headache', 'boerhaave', 'spontaneous rib fracture', 'gerd', 'hiv (initial infection)', 'anemia', 'viral pharyngitis', 'inguinal hernia', 'myasthenia gravis', 'whooping cough', 'anaphylaxis', 'epiglottitis', 'guillain-barré syndrome', 'acute laryngitis', 'croup', 'psvt', 'atrial fibrillation', 'bronchiectasis', 'allergic sinusitis', 'chagas', 'scombroid food poisoning', 'myocarditis', 'larygospasm', 'acute dystonic reactions', 'localized edema', 'sle', 'tuberculosis', 'unstable angina', 'stable angina', 'ebola', 'acute otitis media', 'panic attack', 'bronchospasm / acute asthma exacerbation', 'bronchitis', 'acute copd exacerbation / infection', 'pulmonary embolism', 'urti', 'influenza', 'pneumonia', 'acute rhinosinusitis', 'chronic rhinosinusitis', 'bronchiolitis', 'pulmonary neoplasm', 'possible nstemi / stemi', 'sarcoidosis', 'pancreatic neoplasm', 'acute pulmonary edema', 'pericarditis', 'cannot decide']. The answer should be a single word. The answer is: "
    ],
    'translation_en_to_zh': [
        'Translate the following sentence from Engilish to Chinese. '
    ],
    'translation_zh_to_en': [
        'Translate the following sentence from Chinese to English. '
    ]
}
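
As a quick illustration of how these tables fit together (the real pipeline lives in `main.py` and the data loaders, so the glue code below is hypothetical): `PROMPT_SET` entries are prepended to the input text for the prompted services, `LABEL_TO_ID` maps a model's textual reply back to a label id, and `LABEL_SET` doubles as the candidate-label list for the zero-shot models in `MODEL_SET['hug_zs']`.

```
# Hypothetical glue code showing the intended use of the config tables; main.py implements the real pipeline.
from transformers import pipeline

from config import LABEL_SET, LABEL_TO_ID, PROMPT_SET

task, sentence = 'sst2', 'the movie was a complete waste of time'

# Prompted services (GPT API / ChatGPT): build a query and map the reply back to a label id.
query = PROMPT_SET[task][0] + sentence
reply = 'negative'  # in practice this comes from the model, e.g. reply = bot.ask(query)
label_id = LABEL_TO_ID[task].get(reply.strip().lower(), -1)  # -1 if the reply is not a known label

# Zero-shot classifiers from MODEL_SET['hug_zs']: LABEL_SET provides the candidate labels.
classifier = pipeline('zero-shot-classification', model='facebook/bart-large-mnli')
prediction = classifier(sentence, candidate_labels=LABEL_SET[task])['labels'][0]

print(label_id, prediction)
```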
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
# AdvGLUE-T

30 sentences selected from the AdvGLUE dev set.

The "source" field is the adversarial source sentence and the "target" field is its ground-truth translation.
