Commit 2798fda

Update results on README after Cleaning up code (#41)

Squashed commit messages:

* Adding SentencePiece
* Rearrange
* Adding finetuning scripts
* Making dir structure more managable
* Add Random Search
* Add majority voting
* Strip notebooks
* Add nb stripout from fastai standalone version
* Move everything to one notebook
* Change the name of the file saved
* Change the parameters
* Combine common code
* Pass "name" to methods in hinglish utils
* fix imports
* Pass name as variable to add_padding
* Remove output
* Fix typo
* Change names for ensemble models
* Change from "output" to name of the LM model
* Fix typo
* Remove hardcoding for epochs
* isort and black :)
* Move everything to hinglish
* Coffe isn't good for me
* Break everything into sensible methods
* import clash
* fix run_valid
* Change notebook according to hinglish.py
* Change logs
* Fix imports
* Change method names
* Remove tf dependency
* Remove tf from requirements.txt
* Remove hardcoding from pd columns
* Use store_attr to load variables for class
* Fix the size mismatch error by changing final_test.json file
* nb-stripout worked
* Add majority voting explanation
* extract if tarfile and run_language_modeling documentation
* Split the transformers notebook
* Something broke, I don't know what.
* Remove setup
* Things would be easier if I knew OOP or Python better
* Will fix this later is this works¿
* nan sent¿
* Print eval and test metrics
* Change the label for eval
* Change the file with empty clean_text
* Remove eval testing for now
* Logfile name
* moving the part which copies things to drive here
* Fix formatting add pathlib
* add drivepath
* Changed the file paths
* Remove additional code
* Update model performance after reproducing code (#40)
  - Reproduced on Monday 29th December.
  - [Training Files](https://drive.google.com/drive/folders/12qEbxbefBY24-YqahVV0v7q_IFyxz3L8?usp=sharing)
  - [Model/Output files](https://drive.google.com/drive/folders/1x-6klxSJEQu5gUOR1zHUHyHjHrApKmRD?usp=sharing)

Co-authored-by: meghanabhange <[email protected]>
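The squashed commits mention adding a random search over the fine-tuning hyperparameters reported in the table below (batch size, dropout rates, learning rate, Adam epsilon). A minimal sketch of how such a search could draw configurations, assuming discrete choices for batch size and dropouts and a log-uniform draw for the continuous values; the search space, bounds, and function names here are illustrative assumptions, not the repository's actual code:

```python
import math
import random

# Assumed search space, inferred from the configs in the README table;
# tuples are (low, high) ranges sampled log-uniformly, lists are choices.
SEARCH_SPACE = {
    "batch_size": [4, 8, 16],
    "attention_dropout": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
    "hidden_dropout_prob": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    "learning_rate": (5e-7, 6e-5),
    "adam_epsilon": (1e-8, 1e-4),
}

def sample_config(space, rng):
    """Draw one configuration from the search space."""
    config = {}
    for name, values in space.items():
        if isinstance(values, tuple):
            lo, hi = values
            # Log-uniform sampling spreads trials evenly across magnitudes.
            config[name] = math.exp(rng.uniform(math.log(lo), math.log(hi)))
        else:
            config[name] = rng.choice(values)
    return config

def random_search(n_trials, space, seed=0):
    """Return n_trials independent configurations, reproducible via seed."""
    rng = random.Random(seed)
    return [sample_config(space, rng) for _ in range(n_trials)]
```

Each sampled configuration would then be passed to a fine-tuning run, keeping the one with the best validation F1.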
Parent: a624c16 · Commit: 2798fda

File tree: 1 file changed (+16, −3 lines)


README.md

Lines changed: 16 additions & 3 deletions
```diff
@@ -17,12 +17,25 @@ RoBERTa| 7.54 | 0.64|
 
 Base LM | Dataset| Accuracy | Precision | Recall | F1| LM Perplexity|
 --|--|--|--|--|--|--|
-bert-base-multilingual-cased | Test | 0.686| 0.695| 0.683| 0.685| 8.2|
+bert-base-multilingual-cased | Test | 0.688| 0.698| 0.686| 0.687| 8.2|
 bert-base-multilingual-cased | Valid | 0.62| 0.592 | 0.605| 0.55| 8.2|
-distilbert-base-uncased | Test| 0.671| 0.671| 0.691| 0.677| 6.51|
+distilbert-base-uncased | Test| 0.693| 0.694| 0.703| 0.698| 6.51|
 distilbert-base-uncased | Valid| 0.607| 0.614| 0.600| 0.592| 6.51|
 distilbert-base-multilingual-cased | Test| 0.612| 0.615| 0.616| 0.616| 8.1|
 distilbert-base-multilingual-cased | Valid| 0.55| 0.531| 0.537| 0.495| 8.1|
 roberta-base | Test| 0.630| 0.629| 0.644| 0.635| 7.54|
 roberta-base | Valid| 0.60| 0.617| 0.607| 0.595| 7.54|
-Ensemble | Test| 0.713| 0.715| 0.722| 0.718| |
+Ensemble | Test| 0.714| 0.718| 0.718| 0.718| |
+
+## Ensemble Performance
+
+Model | Accuracy | Precision | Recall | F1 | Config | Link to Model and output files |
+--|--|--|--|--|--|--|
+BERT | 0.68866 | 0.69821 | 0.68608 | 0.6875 | Batch Size - 16<br>Attention Dropout - 0.4<br>Learning Rate - 5e-07<br>Adam epsilon - 1e-08<br>Hidden Dropout Probability - 0.3<br>Epochs - 3 | [BERT](https://drive.google.com/drive/folders/1HAYoWX3zG7XEMaSf74K5dKvaBJdwE6U9?usp=sharing) |
+DistilBert | 0.69333 | 0.69496 | 0.70379 | 0.6982 | Batch Size - 16<br>Attention Dropout - 0.6<br>Learning Rate - 3e-05<br>Adam epsilon - 1e-08<br>Hidden Dropout Probability - 0.6<br>Epochs - 3 | [DistilBert](https://drive.google.com/drive/folders/1t_2XqwtRpui5l1prZsmCaArmqzPjPGob?usp=sharing) |
+EnsembleBert1 | 0.69233 | 0.70236 | 0.69064 | 0.68952 | Batch Size - 4<br>Attention Dropout - 0.7<br>Learning Rate - 5.01e-05<br>Adam epsilon - 4.79e-05<br>Hidden Dropout Probability - 0.1<br>Epochs - 3 | [EnsembleBert1](https://drive.google.com/drive/folders/1-ais3Y04SWFUYHF4KkUAJMDUEdsfu_GB?usp=sharing) |
+EnsembleBert2 | 0.691 | 0.7009 | 0.6889 | 0.68872 | Batch Size - 4<br>Attention Dropout - 0.6<br>Learning Rate - 5.13e-05<br>Adam epsilon - 9.72e-05<br>Hidden Dropout Probability - 0.2<br>Epochs - 3 | [EnsembleBert2](https://drive.google.com/drive/folders/1-rpWWvVruIp_WA0mveU2zHn82fZ5Mcl8?usp=sharing) |
+EnsembleDistilBert1 | 0.70166 | 0.70377 | 0.70976 | 0.7061 | Batch Size - 16<br>Attention Dropout - 0.8<br>Learning Rate - 3.02e-05<br>Adam epsilon - 9.35e-05<br>Hidden Dropout Probability - 0.4<br>Epochs - 3 | [EnsembleDistilBert1](https://drive.google.com/drive/folders/1jqcXPLysVSVCOh5ySKa-fMRIWA_djT_P?usp=sharing) |
+EnsembleDistilBert2 | 0.689 | 0.691 | 0.69666 | 0.69335 | Batch Size - 4<br>Attention Dropout - 0.6<br>Learning Rate - 5.13e-05<br>Adam epsilon - 9.72e-05<br>Hidden Dropout Probability - 0.2<br>Epochs - 3 | [EnsembleDistilBert2](https://drive.google.com/drive/folders/1-3mwr1v3OBzlSrFxOKpec8ERrqHTPaZo?usp=sharing) |
+EnsembleDistilBert3 | 0.69366 | 0.69538 | 0.70557 | 0.69905 | Batch Size - 16<br>Attention Dropout - 0.4<br>Learning Rate - 4.74e-05<br>Adam epsilon - 4.09e-05<br>Hidden Dropout Probability - 0.6<br>Epochs - 3 | [EnsembleDistilBert3](https://drive.google.com/drive/folders/1-KHIKd425T98r0lMjKCv0X7GKaU7K9D5?usp=sharing) |
+Ensemble | 0.71466 | 0.71867 | 0.71853 | 0.7182 | NA | [Ensemble](https://drive.google.com/drive/folders/12Iz0xfxszNMkQE8hxO6ajeTBACoKsWUW?usp=sharing) |
```
