OpenAI GPT API, Anthropic Claude API and Gemini API response time tracker

The charts below track the response times of the main large language model APIs: OpenAI (gpt-4o, gpt-4-turbo, gpt-4, gpt-3.5-turbo), Anthropic Claude (claude-3-5-sonnet) and Gemini (gemini-1.5-flash, gemini-1.5-pro).

Response times are measured every 10 minutes from 3 locations by sending a randomized prompt and generating up to 512 tokens. Recorded response times are capped at 60 seconds, so the actual response time may be higher.
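As a rough illustration of the measurement described above, here is a minimal Python sketch. It is not the tracker's actual code: `call_model`, `random_prompt` and the list of topics are hypothetical placeholders, and recording a timeout as the 60-second cap is an assumption. Only the 512-token limit and the 60-second cap come from the text.

```python
import random
import time
from typing import Callable

MAX_TOKENS = 512      # generation limit stated in the text
CAP_SECONDS = 60.0    # recorded response times are capped at this value


def random_prompt(rng: random.Random) -> str:
    """Build a randomized prompt (hypothetical; the real prompts are not known)."""
    topic = rng.choice(["rivers", "compilers", "bread", "satellites"])
    return f"Write a short paragraph about {topic}. Seed: {rng.randint(0, 10**6)}"


def measure(
    call_model: Callable[[str, int], str],
    rng: random.Random,
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Time one call to `call_model` and return the capped duration in seconds.

    `call_model(prompt, max_tokens)` is a placeholder for whichever API client
    is being measured. A TimeoutError is recorded as the cap (an assumption).
    """
    start = clock()
    try:
        call_model(random_prompt(rng), MAX_TOKENS)
    except TimeoutError:
        return CAP_SECONDS
    elapsed = clock() - start
    return min(elapsed, CAP_SECONDS)
```

Running `measure` on a 10-minute schedule from each location would produce one data point per model and location. Because of the cap, a recorded value of 60 seconds only means "at least 60 seconds".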

OpenAI GPT APIs

GPT for Work

Anthropic Claude APIs


Gemini APIs


How to get a faster response time?

  • Choose a model with a faster response time
  • Try again outside of peak hours
  • Break down your executions into smaller ones
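One possible reading of the last tip, sketched in Python: if an "execution" is a batch of prompts, the batch can be split into smaller chunks that are sent one after another. The helper name `chunked` and the chunk size are illustrative assumptions, not part of any provider's API.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most `size` items, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Usage: send five prompts as batches of two instead of one large run.
prompts = ["p1", "p2", "p3", "p4", "p5"]
for batch in chunked(prompts, 2):
    pass  # call the model of your choice on each prompt in `batch`
```

Smaller batches do not make an individual request faster, but a slow or failed request then delays only its own chunk and not the whole run.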


We are not affiliated with OpenAI, Anthropic or Google.
Please refer to their official status pages for authoritative information:
https://status.openai.com/
https://status.anthropic.com/
https://google.com/appsstatus