Discussions
API response time / request parallelization
Hi team,
we're noticing that the Personio API is often a bottleneck in the systems that rely on it. Our metrics show that response times average around 2 seconds, but with a wide spread (many requests are much quicker, while some take 30+ seconds).
This is especially problematic because we can only send one request at a time (we have to wait for the rotated token to come back before sending the next one), so if one request takes 30 seconds, all subsequent requests have to wait for it to complete (at which point clients usually time out).
Apart from very aggressive caching on our clients, is there any way to work around this? We are generally quite happy with the API's functionality, but we often have to serve stale, cached data, as we can't wait 30+ seconds for fresh data to come back.
For background, all of our requests run through a gateway that proxies them and handles authentication: it fetches a token on startup, then reads the rotated token from each response and saves it for the next request, which forces it to allow only one request at a time. Since multiple API clients can be created nowadays, we could run several gateway instances and balance the load between them, but that's not a particularly nice solution, and I'm not sure it would actually help much.
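To make the setup concrete, here is a minimal sketch of the pattern our gateway implements (all names are made up for illustration, and `transport` stands in for the real HTTP call; the point is just the lock that serializes everything):

```python
import threading

class SerializedTokenGateway:
    """One-request-at-a-time proxy that rotates its auth token from
    each response. `transport` is a placeholder for the real HTTP
    request: it takes (path, token) and returns (body, new_token)."""

    def __init__(self, initial_token, transport):
        self._token = initial_token
        self._transport = transport
        self._lock = threading.Lock()  # serializes all requests

    def request(self, path):
        # Every caller queues here, so one slow request blocks the rest.
        with self._lock:
            body, new_token = self._transport(path, self._token)
            self._token = new_token  # save the rotated token for the next call
            return body

# Fake transport to illustrate the rotation: each call returns a body
# plus a fresh token derived from the previous one.
def fake_transport(path, token):
    return f"data for {path}", token + "'"

gateway = SerializedTokenGateway("t0", fake_transport)
gateway.request("/company/employees")    # sends with t0, stores t0'
gateway.request("/company/attendances")  # sends with t0', stores t0''
```

The lock is exactly the head-of-line blocking problem from above: a 30-second response holds the lock for 30 seconds, and every queued caller pays for it.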
Any ideas? Thanks!
Laura