OpenAI API and other LLM APIs response time tracker

The 3 charts below track the response times of the main large language model APIs: OpenAI (GPT-4, GPT-3.5, GPT-3) and Anthropic Claude.

Response times are measured every 10 minutes from 3 locations by requesting a completion of up to 512 tokens at a temperature of 0.7. Recorded values are capped at 60 seconds, so a reading of 60 seconds means the actual response time may have been longer.
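As a rough illustration of this kind of measurement, here is a minimal Python sketch that times a single request and applies the 60-second cap. It is not the tracker's actual code. The names `measure_response_time` and `send_request` are hypothetical, and `send_request` stands in for whatever API call you want to time (for example, one configured with a 512-token limit and a temperature of 0.7).

```python
import time
from typing import Callable, Tuple

CAP_SECONDS = 60.0  # cap described in the text above


def measure_response_time(
    send_request: Callable[[], object],
    cap_seconds: float = CAP_SECONDS,
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[float, bool]:
    """Time one call to `send_request` and cap the reported value.

    `send_request` is a hypothetical placeholder for the API call being timed.
    Returns (reported_seconds, was_capped). When was_capped is True, the real
    response time was at least `cap_seconds` and may have been longer.
    """
    start = clock()
    send_request()
    elapsed = clock() - start
    was_capped = elapsed >= cap_seconds
    return min(elapsed, cap_seconds), was_capped


if __name__ == "__main__":
    # Stand-in request that just sleeps briefly; replace with a real API call.
    seconds, capped = measure_response_time(lambda: time.sleep(0.05))
    print(f"response time: {seconds:.2f}s (capped: {capped})")
```

This sketch only caps the reported number after the call returns. A real monitor would more likely also set a request timeout so that a slow call is abandoned at the cap.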


GPT for Work

Anthropic Claude APIs


How to get a faster response time?

  • Choose a model with a faster response time
  • Try again outside of peak hours
  • Break down your executions into smaller ones
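The last tip, breaking executions into smaller ones, can be sketched as follows. This is a minimal Python example under stated assumptions: `split_into_batches` is a hypothetical helper, and the batch size of 2 is arbitrary. How you send each batch depends on the API you use.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def split_into_batches(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


if __name__ == "__main__":
    prompts = ["prompt A", "prompt B", "prompt C", "prompt D", "prompt E"]
    for batch in split_into_batches(prompts, batch_size=2):
        # Send each small batch as its own execution instead of one large run.
        print(batch)
```

Smaller executions finish sooner individually and are cheaper to retry if one of them hits a slow period.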

We are not affiliated with OpenAI or Anthropic.
For authoritative information, please refer to their official status pages: