Ciro Santilli
OurBigBook.com
GPT-2
(124 M parameters, 2019-11-05)
Vocabulary size (V): 50,257
Hidden size (d_model): 768
Context length (n_ctx): 1024
Q, K, V size per head (d_head): 64
Attention heads (h): 12
FFN inner size (d_ff): 3072
Layers (L): 12
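
As a rough cross-check, the hyperparameters above can be combined to recover the 124 M parameter figure. The sketch below is not taken from any reference implementation. It assumes details that this page does not state: learned positional embeddings, a bias on every linear layer, two layer norms per block plus one final layer norm, and an output projection that shares its weights with the token embedding. The names `GPT2Config` and `count_parameters` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GPT2Config:
    """Hyperparameters listed above (hypothetical container, for illustration)."""
    vocab_size: int = 50257  # V
    d_model: int = 768       # hidden size
    n_ctx: int = 1024        # context length
    n_head: int = 12         # attention heads (h)
    d_ff: int = 3072         # FFN inner size
    n_layer: int = 12        # layers (L)


def count_parameters(cfg: GPT2Config) -> int:
    """Count trainable parameters under the assumptions stated in the lead-in."""
    d = cfg.d_model
    token_emb = cfg.vocab_size * d       # token embedding, assumed shared with the output head
    pos_emb = cfg.n_ctx * d              # assumed learned positional embedding
    layer_norm = 2 * d                   # scale and shift vectors
    attn_qkv = d * 3 * d + 3 * d         # fused Q, K, V projection with bias
    attn_out = d * d + d                 # attention output projection with bias
    ffn_in = d * cfg.d_ff + cfg.d_ff     # first FFN linear layer with bias
    ffn_out = cfg.d_ff * d + d           # second FFN linear layer with bias
    per_layer = 2 * layer_norm + attn_qkv + attn_out + ffn_in + ffn_out
    # The trailing layer_norm is the assumed final layer norm after the last block.
    return token_emb + pos_emb + cfg.n_layer * per_layer + layer_norm


cfg = GPT2Config()
d_head = cfg.d_model // cfg.n_head   # 768 / 12 = 64, matching d_head above
total = count_parameters(cfg)
print(f"d_head = {d_head}")
print(f"total parameters = {total:,}")   # 124,439,808, i.e. about 124 M
```

Under these assumptions the token embedding alone accounts for about 38.6 M parameters, and each of the 12 layers contributes about 7.1 M.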
Table of contents
Language Models are Unsupervised Multitask Learners
GPT-2 implementation
GPT-2 implementation in PyTorch
  nanoGPT
GPT-2 variant
  GPT-2 medium
  GPT-2 large
  GPT-2 XL
Ancestors (17)
GPT model by OpenAI
List of GPT models
GPT model
Generative pre-trained transformer
Large language model
Text-to-text model
AI text generation
Generative AI by modality
Generative AI
AI by capability
Artificial intelligence
Machine learning
Computer
Information technology
Area of technology
Technology
Home
Incoming links (1)
Number of multiplications per token in a GPT model