Downstream Evaluations of Rotary Position Embeddings

EleutherAI Blog, by Leo Gao, August 16, 2021
A comparison of Rotary Position Embedding against GPT-style learned position embeddings.

The two embeddings were compared head-to-head: both models have 1.3B parameters and were trained for 100k steps on the Pile using Mesh Transformer JAX. No strong trend emerges across tasks, but hopefully someone will find these results useful regardless.

| Task | Metric | Learned | Rotary |
| --- | --- | --- | --- |
| lambada | ppl | 7.940 ± 0.208 | 7.156 ± 0.208 |
| | acc | 0.556 ± 0.007 | 0.567 ± 0.007 |
| piqa | acc | 0.700 ± 0.011 | 0.714 ± 0.011 |
| | acc_norm | 0.693 ± 0.011 | 0.709 ± 0.011 |
| hellaswag | acc | 0.376 ± 0.005 | 0.389 ± 0.005 |
| | acc_norm | 0.472 ± 0.005 | 0.488 ± 0.005 |
| winogrande | acc | 0.540 ± 0.014 | 0.571 ± 0.014 |
| mathqa | acc | 0.231 ± 0.008 | 0.230 ± 0.008 |
| | acc_norm | 0.234 ± 0.008 | 0.227 ± 0.008 |
| pubmedqa | acc | 0.599 ± 0.015 | 0.583 ± 0.015 |
| boolq | acc | 0.575 ± 0.009 | 0.614 ± 0.009 |
| anli_r3 | acc | 0.344 ± 0.014 | 0.351 ± 0.014 |
| openbookqa | acc | 0.198 ± 0.018 | 0.206 ± 0.018 |
| | acc_norm | 0.316 ± 0.021 | 0.330 ± 0.021 |
| triviaqa | acc | 0.041 ± 0.002 | 0.026 ± 0.002 |
| arc_challenge | acc | 0.235 ± 0.012 | 0.230 ± 0.012 |
| | acc_norm | 0.260 ± 0.013 | 0.272 ± 0.013 |
| arc_easy | acc | 0.564 ± 0.010 | 0.568 ± 0.010 |
| | acc_norm | 0.505 ± 0.010 | 0.486 ± 0.010 |
| cb | acc | 0.375 ± 0.065 | 0.357 ± 0.065 |
| cola | mcc | 0.042 ± 0.034 | 0.022 ± 0.034 |
| copa | acc | 0.730 ± 0.044 | 0.730 ± 0.044 |
| ethics_cm | acc | 0.491 ± 0.008 | 0.480 ± 0.008 |
| ethics_deontology | acc | 0.497 ± 0.008 | 0.497 ± 0.008 |
| ethics_justice | acc | 0.501 ± 0.010 | 0.501 ± 0.010 |
| ethics_utilitarianism | acc | 0.497 ± 0.007 | 0.493 ± 0.007 |
| ethics_virtue | acc | 0.200 ± 0.006 | 0.200 ± 0.006 |
| headqa | acc | 0.227 ± 0.008 | 0.224 ± 0.008 |
| | acc_norm | 0.270 ± 0.008 | 0.271 ± 0.008 |
| logiqa | acc | 0.221 ± 0.016 | 0.215 ± 0.016 |
| | acc_norm | 0.293 ± 0.018 | 0.283 ± 0.018 |
| mnli | acc | 0.344 ± 0.005 | 0.344 ± 0.005 |
| mnli_mismatched | acc | 0.345 ± 0.005 | 0.349 ± 0.005 |
| mrpc | acc | 0.684 ± 0.023 | 0.684 ± 0.023 |
| | f1 | 0.812 ± 0.017 | 0.812 ± 0.017 |
| qa4mre_2011 | acc | 0.392 ± 0.045 | 0.358 ± 0.045 |
| | acc_norm | 0.450 ± 0.045 | 0.433 ± 0.045 |
| qa4mre_2012 | acc | 0.287 ± 0.036 | 0.312 ± 0.036 |
| | acc_norm | 0.394 ± 0.039 | 0.400 ± 0.039 |
| qa4mre_2013 | acc | 0.335 ± 0.028 | 0.335 ± 0.028 |
| | acc_norm | 0.352 ± 0.028 | 0.349 ± 0.028 |
| qnli | acc | 0.498 ± 0.007 | 0.517 ± 0.007 |
| qqp | acc | 0.370 ± 0.002 | 0.368 ± 0.002 |
| | f1 | 0.538 ± 0.003 | 0.538 ± 0.003 |
| race | acc | 0.345 ± 0.015 | 0.343 ± 0.015 |
| record | f1 | 0.805 ± 0.004 | 0.813 ± 0.004 |
| | em | 0.797 ± 0.004 | 0.805 ± 0.004 |
| rte | acc | 0.538 ± 0.030 | 0.523 ± 0.030 |
| sciq | acc | 0.867 ± 0.011 | 0.865 ± 0.011 |
| | acc_norm | 0.796 ± 0.013 | 0.771 ± 0.013 |
| sst | acc | 0.572 ± 0.017 | 0.519 ± 0.017 |
| webqs | acc | 0.021 ± 0.003 | 0.006 ± 0.003 |
| wic | acc | 0.500 ± 0.020 | 0.498 ± 0.020 |
| wnli | acc | 0.437 ± 0.059 | 0.549 ± 0.059 |
| wsc | acc | 0.365 ± 0.047 | 0.365 ± 0.047 |
| wsc273 | acc | 0.722 ± 0.027 | 0.736 ± 0.027 |
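
One rough way to read these numbers is to compare each difference against its reported uncertainty. The sketch below is not part of the original evaluation. It assumes the ± values are standard errors and that the two runs are independent, neither of which the post states, and the names `z_score`, `verdict`, and the threshold of 2 are hypothetical choices for illustration. Only a handful of rows are copied in.

```python
import math

# A few rows copied from the results: (task, metric, learned, rotary, reported ±).
ROWS = [
    ("lambada", "ppl", 7.940, 7.156, 0.208),
    ("lambada", "acc", 0.556, 0.567, 0.007),
    ("winogrande", "acc", 0.540, 0.571, 0.014),
    ("boolq", "acc", 0.575, 0.614, 0.009),
    ("triviaqa", "acc", 0.041, 0.026, 0.002),
    ("copa", "acc", 0.730, 0.730, 0.044),
    ("wnli", "acc", 0.437, 0.549, 0.059),
]

# Perplexity improves as it goes down; every other metric here improves as it goes up.
LOWER_IS_BETTER = {"ppl"}


def z_score(learned, rotary, err, metric):
    """Difference in units of combined uncertainty; positive favors rotary.

    ASSUMPTION: `err` is a standard error and the two runs are independent,
    so the uncertainties add in quadrature.
    """
    diff = rotary - learned
    if metric in LOWER_IS_BETTER:
        diff = -diff
    combined = math.sqrt(err ** 2 + err ** 2)
    return diff / combined


def verdict(z, threshold=2.0):
    """Label a difference; the threshold of 2 is a convention, not from the post."""
    if z > threshold:
        return "rotary"
    if z < -threshold:
        return "learned"
    return "inconclusive"


for task, metric, learned, rotary, err in ROWS:
    z = z_score(learned, rotary, err, metric)
    print(f"{task:12s} {metric:4s} z={z:+.2f} -> {verdict(z)}")
```

Under these assumptions, lambada perplexity and boolq come out in favor of rotary, triviaqa in favor of learned, and the remaining sampled rows are inconclusive. With this many tasks, a few differences would be expected to cross the threshold by chance alone, so individual rows should be read with caution.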
