Compare OpenAI, Azure, Anthropic model APIs

Easily compare the most popular generative AI LLM APIs by token limit, price, rate limits, latency, supported languages, and training data cutoff.

| Provider | Model | Max tokens | Input price per 1M tokens | Output price per 1M tokens | Default requests per minute | Default tokens per minute | Avg latency in the last 48h* | Training data cutoff | Languages available |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| OpenAI | gpt-4-turbo | 128,000 | $10.00 | $30.00 | 20 | 10,000 | N/A | Apr 2023 | All |
| OpenAI | gpt-4 | 8,192 | $30.00 | $60.00 | 500 | 20,000 | N/A | Sep 2021 | All |
| OpenAI | gpt-4-32k | 32,768 | $60.00 | $120.00 | 500 | 20,000 | N/A | Sep 2021 | All |
| OpenAI | gpt-3.5-turbo | 16,384 | $1.00 | $2.00 | 3,500 | 40,000 | N/A | Sep 2021 | All |
| OpenAI | gpt-3.5-turbo-instruct | 4,096 | $1.50 | $2.00 | 3,500 | 40,000 | N/A | Sep 2021 | All |
| OpenAI | text-embedding-ada-002 | 8,191 | $0.10 | N/A | 500 | 1,000,000 | N/A | Sep 2021 | All |
| Azure | gpt-4-turbo | 128,000 | $10.00 | $30.00 | 20 | 10,000 | N/A | Apr 2023 | All |
| Azure | gpt-4 | 8,192 | $30.00 | $60.00 | 120 | 20,000 | N/A | Sep 2021 | All |
| Azure | gpt-4-32k | 32,768 | $60.00 | $120.00 | 360 | 60,000 | N/A | Sep 2021 | All |
| Azure | gpt-3.5-turbo | 4,096 | $1.50 | $2.00 | 1,440 | 240,000 | N/A | Sep 2021 | All |
| Azure | gpt-3.5-turbo-instruct | 4,096 | $1.50 | $2.00 | 1,440 | 240,000 | N/A | Sep 2021 | All |
| Azure | gpt-3.5-turbo-16k | 16,384 | $3.00 | $4.00 | 1,440 | 240,000 | N/A | Sep 2021 | All |
| Azure | text-embedding-ada-002 | 8,191 | $0.10 | N/A | 1,440 | 240,000 | N/A | Sep 2021 | All |
| Anthropic | claude-instant-1 | 100,000 | $1.63 | $5.51 | N/A | N/A | N/A | Late 2021 | All |
| Anthropic | claude-2 | 100,000 | $11.02 | $32.68 | N/A | N/A | N/A | Early 2023 | All |

OpenAI automatically raises your RPM limit once you reach these consumption tiers:

| Tier | Requirement | gpt-4, gpt-4-32k | gpt-3.5-turbo, gpt-3.5-turbo-instruct | text-embedding-ada-002 |
| --- | --- | --- | --- | --- |
| Free | | 3 RPM | 3 RPM | 3 RPM |
| Tier 1 | $5 paid | 500 RPM | 3,500 RPM | 500 RPM |
| Tier 2 | $50 paid and 7+ days since first payment | 5,000 RPM | 3,500 RPM | 500 RPM |
| Tier 3 | $100 paid and 7+ days since first payment | 5,000 RPM | 3,500 RPM | 5,000 RPM |
| Tier 4 | $250 paid and 14+ days since first payment | 10,000 RPM | 10,000 RPM | 10,000 RPM |
| Tier 5 | $1,000 paid and 30+ days since first payment | 10,000 RPM | 10,000 RPM | 10,000 RPM |

OpenAI automatically raises your TPM limit once you reach these consumption tiers:

| Tier | Requirement | gpt-4, gpt-4-32k | gpt-3.5-turbo, gpt-3.5-turbo-instruct | text-embedding-ada-002 |
| --- | --- | --- | --- | --- |
| Free | | 10,000 TPM | 20,000 TPM | 150,000 TPM |
| Tier 1 | $5 paid | 20,000 TPM | 40,000 TPM | 1,000,000 TPM |
| Tier 2 | $50 paid and 7+ days since first payment | 40,000 TPM | 80,000 TPM | 1,000,000 TPM |
| Tier 3 | $100 paid and 7+ days since first payment | 80,000 TPM | 160,000 TPM | 5,000,000 TPM |
| Tier 4 | $250 paid and 14+ days since first payment | 300,000 TPM | 1,000,000 TPM | 5,000,000 TPM |
| Tier 5 | $1,000 paid and 30+ days since first payment | 300,000 TPM | 1,000,000 TPM | 10,000,000 TPM |

*The average latency is calculated by generating up to 512 tokens at a temperature of 0.7 every 10 minutes across 3 different locations. Check out our latency tracker.

FAQ

What is “Max tokens”?

The maximum number of tokens that the model can process in a single request. This limit includes both the input (prompt) and the output (completion) tokens.
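
Because input and output share one limit, the room left for a completion shrinks as the prompt grows. The following is a minimal sketch of that arithmetic, using the 8,192-token limit listed for gpt-4; the function name and the prompt size are hypothetical, and real token counts depend on the model's tokenizer.

```python
def max_completion_tokens(context_limit: int, prompt_tokens: int) -> int:
    """Return how many output tokens still fit under the shared limit."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Input and output count against the same limit, so the remainder
    # is the output budget; a prompt over the limit leaves no room.
    return max(context_limit - prompt_tokens, 0)


GPT4_LIMIT = 8_192  # "Max tokens" listed for gpt-4

# Hypothetical prompt of 6,000 tokens
print(max_completion_tokens(GPT4_LIMIT, 6_000))  # 2192
```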

What is “Input price per 1M tokens”?

The cost of processing 1 million input (prompt) tokens using this model.

What is “Output price per 1M tokens”?

The cost of generating 1 million output (completion) tokens using this model.
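
Taken together, the input and output prices give the cost of a single request. The sketch below works through that arithmetic with the gpt-4-turbo prices from the comparison ($10.00 input, $30.00 output per 1M tokens); the function name and the token counts are made up for illustration.

```python
from decimal import Decimal

ONE_MILLION = Decimal(1_000_000)


def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_1m: str, output_price_per_1m: str) -> Decimal:
    """Return the cost in USD of one request, given per-1M-token prices."""
    input_cost = Decimal(input_tokens) * Decimal(input_price_per_1m)
    output_cost = Decimal(output_tokens) * Decimal(output_price_per_1m)
    return (input_cost + output_cost) / ONE_MILLION


# Hypothetical request: 1,500 prompt tokens and 500 completion tokens
cost = request_cost(1_500, 500, "10.00", "30.00")
print(f"${cost:.4f}")  # $0.0300
```

Decimal is used here only to avoid floating-point rounding noise in small currency amounts.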

What is “Default requests per minute”?

The maximum number of requests that can be made to this API within one minute under the default rate limit. Some providers offer the option to increase this limit upon request.

What is “Default tokens per minute”?

The maximum number of tokens (input+output) that can be processed within one minute by this model API. Some providers offer the option to increase this limit upon request.
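
The two limits apply at the same time, so whichever is reached first determines throughput. As a rough sketch, assuming every request uses about the same number of tokens (a simplification), the sustainable request rate can be estimated as follows. The example uses the defaults listed for Azure gpt-4 (120 requests and 20,000 tokens per minute); the function name and the per-request token count are hypothetical.

```python
def effective_requests_per_minute(rpm_limit: int, tpm_limit: int,
                                  avg_tokens_per_request: int) -> int:
    """Estimate how many requests per minute fit under both limits."""
    if avg_tokens_per_request <= 0:
        raise ValueError("avg_tokens_per_request must be positive")
    # The token limit allows this many whole requests per minute.
    requests_allowed_by_tokens = tpm_limit // avg_tokens_per_request
    # The stricter of the two limits wins.
    return min(rpm_limit, requests_allowed_by_tokens)


# Azure gpt-4 defaults, with a hypothetical 1,000 tokens per request
print(effective_requests_per_minute(120, 20_000, 1_000))  # 20
```

With requests of this size the token limit, not the request limit, is the bottleneck: only 20 of the 120 allowed requests fit into one minute.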

What is “Avg latency in the last 48h”?

The average response time of the model API, measured by generating a maximum of 512 tokens at a temperature of 0.7 every 10 minutes in 3 locations, during the last 48h. The maximum response time is capped at 60 seconds but could be higher in reality.
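
The effect of the 60-second cap on the average can be sketched as below. How the tracker actually aggregates its samples is not stated here; the sketch assumes a plain mean over all capped samples from all locations, and the function name and sample values are made up for illustration.

```python
from statistics import fmean
from typing import Optional, Sequence

CAP_SECONDS = 60.0  # response times above this are recorded as 60 s

# One probe every 10 minutes from 3 locations over 48 hours
SAMPLES_PER_WINDOW = 48 * 6 * 3  # 864


def average_latency(samples_seconds: Sequence[float]) -> Optional[float]:
    """Return the mean of capped samples, or None when there are none."""
    if not samples_seconds:
        return None  # corresponds to "N/A" in the comparison
    capped = [min(sample, CAP_SECONDS) for sample in samples_seconds]
    return fmean(capped)


# A 90 s response counts as 60 s, so the average understates it
print(average_latency([2.0, 4.0, 90.0]))  # 22.0
```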

What is “Training data cutoff”?

The training data cutoff date is the date of the latest knowledge the model has. The model cannot know anything that happened or was published after this date.

What is “Languages available”?

The languages that the model can be prompted in and generate text in.

OpenAI

Established in 2015, OpenAI is an American research laboratory dedicated to the advancement of artificial intelligence (AI). With a strong emphasis on the development of safe and beneficial artificial general intelligence (AGI), OpenAI aims to create highly autonomous systems capable of surpassing human performance in economically valuable tasks. The organization was co-founded by Ilya Sutskever, Greg Brockman, Trevor Blackwell, Vicki Cheung, Andrej Karpathy, Durk Kingma, Jessica Livingston, John Schulman, Pamela Vagata, and Wojciech Zaremba, with Sam Altman and Elon Musk serving as the initial board members. In 2019, OpenAI received a $1 billion investment from Microsoft, followed by a $10 billion investment in 2023.

gpt-4-turbo

With 128k context, fresher knowledge and the broadest set of capabilities, GPT-4 Turbo is more powerful than GPT-4 and offered at a lower price.

gpt-4

More capable than any GPT-3.5 model, able to do more complex tasks, and optimized for chat.

gpt-4-32k

Same capabilities as the base gpt-4 model but with 4x the context length.

gpt-3.5-turbo-16k

Most capable GPT-3.5 model and optimized for chat at 1/10th the cost of text-davinci-003.

gpt-3.5-turbo-instruct

gpt-3.5-turbo-instruct is an Instruct model and only supports a 4K context window.

text-embedding-ada-002

OpenAI’s second generation embedding model, text-embedding-ada-002 is designed to replace the previous 16 first-generation embedding models at a fraction of the cost.

Azure AI

Azure OpenAI Service gives customers advanced language AI with OpenAI GPT-4, GPT-3, Codex, and DALL-E models with the security and enterprise promise of Azure. Azure OpenAI co-develops the APIs with OpenAI, ensuring compatibility and a smooth transition from one to the other.

gpt-3.5-turbo

Most capable GPT-3.5 model and optimized for chat at 1/10th the cost of text-davinci-003.

gpt-3.5-turbo-16k

Same capabilities as the standard gpt-3.5-turbo model but with 4 times the context.

gpt-4

More capable than any GPT-3.5 model, able to do more complex tasks, and optimized for chat.

gpt-4-32k

Same capabilities as the base gpt-4 model but with 4x the context length.

text-embedding-ada-002

OpenAI’s second generation embedding model, text-embedding-ada-002 is designed to replace the previous 16 first-generation embedding models at a fraction of the cost.

Anthropic

Anthropic PBC, a US-based startup and public-benefit corporation, was founded by former members of OpenAI. Specializing in the development of general AI systems and language models, Anthropic operates with a strong commitment to responsible AI usage. As of July 2023, Anthropic has successfully raised $1.5 billion in funding.

claude-2

Anthropic's most powerful model, which excels at a wide range of tasks from sophisticated dialogue and creative content generation to detailed instruction following.

claude-instant-1

A faster, cheaper yet still very capable model, which can handle a range of tasks including casual dialogue, text analysis, summarization, and document comprehension.