
Commit 97e3e13

remove references to PTB perplexity numbers
removing them until we can give appropriate command-line usage that allows users to try the script on PTB again
1 parent c5985a8 · commit 97e3e13

1 file changed (+4 −8)

word_language_model/README.md

Lines changed: 4 additions & 8 deletions
@@ -45,12 +45,8 @@ With these arguments, a variety of models can be tested.
 As an example, the following arguments produce slower but better models:
 
 ```bash
-python main.py --cuda --emsize 650 --nhid 650 --dropout 0.5 --epochs 40 # Test perplexity of 80.97
-python main.py --cuda --emsize 650 --nhid 650 --dropout 0.5 --epochs 40 --tied # Test perplexity of 75.96
-python main.py --cuda --emsize 1500 --nhid 1500 --dropout 0.65 --epochs 40 # Test perplexity of 77.42
-python main.py --cuda --emsize 1500 --nhid 1500 --dropout 0.65 --epochs 40 --tied # Test perplexity of 72.30
+python main.py --cuda --emsize 650 --nhid 650 --dropout 0.5 --epochs 40
+python main.py --cuda --emsize 650 --nhid 650 --dropout 0.5 --epochs 40 --tied
+python main.py --cuda --emsize 1500 --nhid 1500 --dropout 0.65 --epochs 40
+python main.py --cuda --emsize 1500 --nhid 1500 --dropout 0.65 --epochs 40 --tied
 ```
-
-Perplexities on PTB are equal or better than
-[Recurrent Neural Network Regularization (Zaremba et al. 2014)](https://arxiv.org/pdf/1409.2329.pdf)
-and are similar to [Using the Output Embedding to Improve Language Models (Press & Wolf 2016)](https://arxiv.org/abs/1608.05859) and [Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling (Inan et al. 2016)](https://arxiv.org/pdf/1611.01462.pdf), though both of these papers have improved perplexities by using a form of recurrent dropout [(variational dropout)](http://papers.nips.cc/paper/6241-a-theoretically-grounded-application-of-dropout-in-recurrent-neural-networks).
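For context, the commit body says the PTB numbers will return once the README can show appropriate command-line usage for running the script on PTB. A minimal sketch of what that might look like, assuming `main.py` keeps its `--data` flag and the corpus loader reads `train.txt`, `valid.txt`, and `test.txt` from that directory; the `./data/penn` path is hypothetical and not part of this commit:

```bash
# Hypothetical invocation, not from this commit: point --data at a directory
# containing the PTB splits named train.txt, valid.txt, and test.txt.
python main.py --cuda --data ./data/penn --emsize 650 --nhid 650 --dropout 0.5 --epochs 40 --tied
```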
