Category: DATA ANALYSIS/Paper (4)
HAZEL

RoBERTa: A Robustly Optimized BERT Pretraining Approach
https://arxiv.org/abs/1907.11692
"Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging. Training is computationally expensive, often done on private datasets of different sizes, and, as we will show, hyperpar..." (arxiv.org)
Paper presentation slides (PPT).
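One of RoBERTa's key changes over BERT is dynamic masking: the masked positions are re-sampled every time a sequence is seen during training, instead of being fixed once at preprocessing time. A minimal plain-Python sketch (the `dynamic_mask` helper and the 15% rate are illustrative, not the paper's exact procedure, which also mixes in random and kept tokens):

```python
import random

def dynamic_mask(tokens, mask_token="[MASK]", p=0.15, rng=None):
    # RoBERTa-style dynamic masking: a fresh random ~15% of tokens is
    # masked on every pass over the data, so each epoch sees a
    # different masking pattern for the same sentence.
    rng = rng or random.Random()
    return [mask_token if rng.random() < p else t for t in tokens]

tokens = "the quick brown fox jumps over the lazy dog".split()
# Different calls (think: different epochs) produce different masks.
print(dynamic_mask(tokens, rng=random.Random(1)))
print(dynamic_mask(tokens, rng=random.Random(2)))
```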

Presented at the NLP paper study group; this post contains only the slides. An explanatory write-up will be added later. **
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
arxiv.org/abs/1909.11942
"Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks. However, at some point further model increases become harder due to GPU/TPU memory limitations and longer..."
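For a rough sense of ALBERT's factorized embedding parameterization: splitting the V x H embedding matrix into a V x E lookup followed by an E x H projection shrinks the embedding parameter count dramatically when E is much smaller than H. A toy back-of-the-envelope calculation (the sizes below are illustrative, BERT-base-like numbers):

```python
# Embedding-layer parameter counts, with and without ALBERT's
# factorized embedding parameterization (illustrative sizes).
V = 30000  # vocabulary size
H = 768    # hidden size
E = 128    # factorized embedding size

dense = V * H               # standard embedding: V x H
factorized = V * E + E * H  # ALBERT: V x E lookup + E x H projection

print(dense)       # 23,040,000 parameters
print(factorized)  # 3,938,304 parameters (~6x fewer)
```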

Presented at the NLP paper study group; this post contains only the slides. An explanatory write-up will be added later. **
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
arxiv.org/abs/1901.02860
"Transformers have a potential of learning longer-term dependency, but are limited by a fixed-length context in the setting of language modeling. We propose a novel neural architecture Transformer-XL that enables learning dependency beyond a ..."
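Transformer-XL's segment-level recurrence can be caricatured as: cache the previous segment's hidden states (with gradients stopped, in the paper) and prepend them as extra context when processing the next segment. A toy sketch of that caching loop (the `layer` stand-in is just an elementwise transform, not a real transformer layer):

```python
import numpy as np

def layer(x):
    # Stand-in for one transformer layer: here just an elementwise
    # transform applied over the extended (memory + segment) context.
    return np.tanh(x)

def forward_with_memory(segments, mem_len=4):
    """Toy segment-level recurrence: each segment is processed together
    with the cached hidden states of the previous segment."""
    memory = np.zeros((0, segments[0].shape[1]))
    outputs = []
    for seg in segments:
        context = np.concatenate([memory, seg], axis=0)  # extended context
        h = layer(context)
        outputs.append(h[-len(seg):])  # outputs for the current segment only
        memory = h[-mem_len:]          # cache the last mem_len states for next segment
    return outputs
```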

Presented at the NLP paper study group; this post contains only the slides. An explanatory write-up will be added later. **
Attention Is All You Need
arxiv.org/abs/1706.03762
"The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new ..." (arxiv.org)
Paper presentation slides (PPT).
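The core building block of the Transformer is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. A small NumPy sketch of that formula (shapes are toy values chosen for illustration):

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)
    return weights @ V, weights

# Toy example: 3 query positions, 4 key/value positions, d_k = 8.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape)       # (3, 8): one context vector per query
print(w.sum(axis=-1))  # each row of attention weights sums to 1
```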