Activation Function Ablation

EleutherAI Blog, May 24, 2021
An ablation of activation functions in GPT-like autoregressive language models.

This was an ablation of activation functions on GPT-like models of ~100M params that I ran ages ago. Each model was trained for 10k iterations, which isn't very long. My original goal was to show that the activation function doesn't matter that much, but to do so I'd need to do a bunch more runs to estimate the variance and show no statistical significance, and I don't plan on running a more exhaustive version of this experiment any time soon. So, I'm just dumping these results here in case anyone has any use for them. All the activation definitions are here.
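For anyone who does collect the extra runs, here is a minimal sketch of one way to check for a significant difference between two activations: an exact two-sided permutation test on the difference in mean metric across seeds. This is an illustration, not the procedure used for this post (which has only one run per activation). The function name `permutation_test` and its inputs are hypothetical; the inputs are assumed to be lists of per-seed values of a single metric such as validation BPB.

```python
from itertools import combinations
from statistics import mean


def permutation_test(a, b):
    """Exact two-sided permutation test on the difference in means.

    a, b: per-seed metric values (hypothetical inputs, e.g. validation BPB
    from several seeds) for two activation functions.
    Returns (observed_difference, p_value).
    """
    pooled = list(a) + list(b)
    n_a = len(a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = mean(a) - mean(b)

    relabelings = 0
    at_least_as_extreme = 0
    # Enumerate every way of assigning n_a of the pooled runs to group "a".
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = sum_a / n_a - (total - sum_a) / n_b
        relabelings += 1
        # Small tolerance so the observed labelling always counts itself.
        if abs(diff) >= abs(observed) - 1e-12:
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / relabelings
```

The enumeration is exact, so it only stays cheap for a handful of seeds per activation (5 vs. 5 gives 252 relabelings). With 30 activations, the pairwise p-values would also need a multiple-comparison correction before drawing conclusions.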

| Name | Pile Validation BPB | LAMBADA acc (%) | LAMBADA ppl |
|---|---|---|---|
| softsign | 1.1485 | 34.3 | 81.32 |
| ReLU | 1.1482 | 34.3 | 82.01 |
| spike2 | 1.1480 | 34.4 | 83.13 |
| selu | 1.1485 | 34.5 | 83.32 |
| elish | 1.1492 | 33.9 | 84.04 |
| tanhexp | 1.1474 | 33.7 | 84.06 |
| sigmoid | 1.1484 | 33.9 | 85.20 |
| tanhshrink | 1.1483 | 33.9 | 85.42 |
| maxtanh | 1.1479 | 33.7 | 85.53 |
| roottanh | 1.1485 | 33.4 | 86.00 |
| softplusmone | 1.1488 | 34.1 | 86.21 |
| logsoftmax | 1.1492 | 34.2 | 86.29 |
| ELU | 1.1496 | 33.8 | 86.37 |
| Swish | 1.1482 | 33.7 | 86.42 |
| softmax | 1.1491 | 33.2 | 86.74 |
| square_relax | 1.1484 | 33.5 | 86.92 |
| lisht | 1.1500 | 33.8 | 87.17 |
| GELU | 1.1453 | 34.0 | 87.84 |
| abs | 1.1489 | 33.5 | 87.96 |
| tanh | 1.1481 | 33.2 | 89.28 |
| Mish | 1.1482 | 33.6 | 89.84 |
| triangle_relax | 1.1502 | 33.7 | 89.91 |
| seagull | 1.1487 | 33.3 | 90.08 |
| maxsig | 1.1480 | 33.3 | 90.23 |
| softplus | 1.1460 | 33.1 | 90.74 |
| minsin | 1.1498 | 33.3 | 91.18 |
| snake | 1.1484 | 33.1 | 91.93 |
| cosid | 1.1490 | 33.3 | 92.99 |
| spike | 1.1498 | 33.3 | 93.78 |
| bipolarsigmoid | 1.1513 | 32.8 | 96.73 |
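The rows above are ordered by LAMBADA perplexity, which makes it hard to see how the other columns compare. The following is a small sketch for re-ranking the results by any column and measuring the spread between the best and worst entry. The numbers are copied from the table; the names `RESULTS` and `extremes` are introduced here only for illustration.

```python
# (name, Pile validation BPB, LAMBADA acc, LAMBADA ppl), copied from the table.
RESULTS = [
    ("softsign", 1.1485, 34.3, 81.32),
    ("ReLU", 1.1482, 34.3, 82.01),
    ("spike2", 1.1480, 34.4, 83.13),
    ("selu", 1.1485, 34.5, 83.32),
    ("elish", 1.1492, 33.9, 84.04),
    ("tanhexp", 1.1474, 33.7, 84.06),
    ("sigmoid", 1.1484, 33.9, 85.20),
    ("tanhshrink", 1.1483, 33.9, 85.42),
    ("maxtanh", 1.1479, 33.7, 85.53),
    ("roottanh", 1.1485, 33.4, 86.00),
    ("softplusmone", 1.1488, 34.1, 86.21),
    ("logsoftmax", 1.1492, 34.2, 86.29),
    ("ELU", 1.1496, 33.8, 86.37),
    ("Swish", 1.1482, 33.7, 86.42),
    ("softmax", 1.1491, 33.2, 86.74),
    ("square_relax", 1.1484, 33.5, 86.92),
    ("lisht", 1.1500, 33.8, 87.17),
    ("GELU", 1.1453, 34.0, 87.84),
    ("abs", 1.1489, 33.5, 87.96),
    ("tanh", 1.1481, 33.2, 89.28),
    ("Mish", 1.1482, 33.6, 89.84),
    ("triangle_relax", 1.1502, 33.7, 89.91),
    ("seagull", 1.1487, 33.3, 90.08),
    ("maxsig", 1.1480, 33.3, 90.23),
    ("softplus", 1.1460, 33.1, 90.74),
    ("minsin", 1.1498, 33.3, 91.18),
    ("snake", 1.1484, 33.1, 91.93),
    ("cosid", 1.1490, 33.3, 92.99),
    ("spike", 1.1498, 33.3, 93.78),
    ("bipolarsigmoid", 1.1513, 32.8, 96.73),
]


def extremes(rows, col, lower_is_better=True):
    """Return the (best_row, worst_row) pair for the given column index."""
    ordered = sorted(rows, key=lambda r: r[col], reverse=not lower_is_better)
    return ordered[0], ordered[-1]


best_bpb, worst_bpb = extremes(RESULTS, 1)                        # BPB: lower is better
best_acc, worst_acc = extremes(RESULTS, 2, lower_is_better=False)  # accuracy: higher is better
best_ppl, worst_ppl = extremes(RESULTS, 3)                        # perplexity: lower is better

bpb_spread = worst_bpb[1] - best_bpb[1]
print(f"BPB: best {best_bpb[0]}, worst {worst_bpb[0]}, spread {bpb_spread:.4f}")
print(f"acc: best {best_acc[0]}, worst {worst_acc[0]}")
print(f"ppl: best {best_ppl[0]}, worst {worst_ppl[0]}")
```

By validation BPB the whole table spans 1.1453 (GELU) to 1.1513 (bipolarsigmoid), a gap of 0.0060, or roughly 0.5%. With a single run per activation there is no variance estimate to compare that gap against, so the ordering should be read with that caveat in mind.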