diff --git a/pytorch_fairseq_roberta.md b/pytorch_fairseq_roberta.md
index 8715762..82f6170 100644
--- a/pytorch_fairseq_roberta.md
+++ b/pytorch_fairseq_roberta.md
@@ -16,33 +16,23 @@ demo-model-link: https://huggingface.co/spaces/pytorch/RoBERTa
 ---
 
 
-### Model Description
+### 모델 설명
 
-Bidirectional Encoder Representations from Transformers, or [BERT][1], is a
-revolutionary self-supervised pretraining technique that learns to predict
-intentionally hidden (masked) sections of text. Crucially, the representations
-learned by BERT have been shown to generalize well to downstream tasks, and when
-BERT was first released in 2018 it achieved state-of-the-art results on many NLP
-benchmark datasets.
+Bidirectional Encoder Representations from Transformers, [BERT][1]는 텍스트에서 의도적으로 숨겨진(masked) 부분을 예측하도록 학습하는 획기적인 self-supervised pretraining 기법이다. 결정적으로 BERT가 학습한 표현은 downstream tasks에 잘 일반화되는 것으로 나타났으며, 2018년에 처음 공개되었을 때 많은 NLP benchmark datasets에서 state-of-the-art 결과를 달성했다.
 
-[RoBERTa][2] builds on BERT's language masking strategy and modifies key
-hyperparameters in BERT, including removing BERT's next-sentence pretraining
-objective, and training with much larger mini-batches and learning rates.
-RoBERTa was also trained on an order of magnitude more data than BERT, for a
-longer amount of time. This allows RoBERTa representations to generalize even
-better to downstream tasks compared to BERT.
+[RoBERTa][2]는 BERT의 language masking strategy를 기반으로 하되, BERT의 next-sentence pretraining objective를 제거하고 훨씬 더 큰 미니 배치와 학습률(learning rate)로 훈련하는 등 주요 하이퍼파라미터를 수정한다. 또한 RoBERTa는 BERT보다 훨씬 많은 데이터로 더 오랜 시간 학습되었다. 이를 통해 RoBERTa의 표현은 BERT에 비해 downstream tasks에 더욱 잘 일반화될 수 있다.
 
 
-### Requirements
+### 요구 사항
 
-We require a few additional Python dependencies for preprocessing:
+전처리 과정을 위해 몇 가지 추가적인 Python 의존성이 필요하다:
 
 ```bash
 pip install regex requests hydra-core omegaconf
 ```
 
 
-### Example
+### 예시
 
 ##### Load RoBERTa
 ```python
@@ -95,7 +85,7 @@ logprobs = roberta.predict('new_task', tokens)  # tensor([[-1.1050, -1.0672, -1.
 ```
 
 
-### References
+### 참고
 
 - [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding][1]
 - [RoBERTa: A Robustly Optimized BERT Pretraining Approach][2]
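The `roberta.predict('new_task', tokens)` call referenced in the second hunk only works after a classification head has been registered. A minimal sketch of that load-and-predict flow is below, assuming the `pytorch/fairseq` torch.hub entry point and the `roberta.large` variant used elsewhere in this model card; `num_classes=3` is an assumption chosen to match the three log-probabilities visible in the hunk header.

```python
import torch

# Load pretrained RoBERTa through torch.hub (requires the preprocessing
# dependencies from the Requirements section: regex, requests, hydra-core, omegaconf).
roberta = torch.hub.load('pytorch/fairseq', 'roberta.large')
roberta.eval()  # disable dropout for deterministic inference

# Byte-pair encode a sentence into a tensor of token ids.
tokens = roberta.encode('Hello world!')

# Register an (untrained) classification head and score the sentence with it.
# The head name 'new_task' comes from the hunk above; num_classes=3 is assumed.
roberta.register_classification_head('new_task', num_classes=3)
logprobs = roberta.predict('new_task', tokens)
print(logprobs.shape)  # expected: torch.Size([1, 3])
```

`predict` returns log-probabilities over the head's classes, so the printed shape should be `(1, num_classes)`; the freshly registered head is randomly initialised and needs fine-tuning before its outputs are meaningful.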