OpenAI API and other LLM APIs response time tracker
The 3 charts below track the response times of the main large language model APIs: OpenAI and Azure OpenAI (GPT-4, GPT-3.5, GPT-3), Anthropic Claude and Google PaLM.
The response times are measured by generating a maximum of 512 tokens at a temperature of 0.7 every 10 minutes in 3 locations. The maximum response time is capped at 60 seconds but could be higher in reality.
OpenAI and Azure OpenAI APIs
Google PaLM APIs
Anthropic Claude APIs
How to get a faster response time?
- Choose a model with a faster response time
- Try again outside of peak hours
- Break down your executions into smaller ones
We are not affiliated with OpenAI, Microsoft, Google or Anthropic.
Please refer to their official status pages for official information:
https://status.openai.com/
https://azure.status.microsoft/en-gb/status
https://status.cloud.google.com/