Elasticsearch has a search thread pool and a search queue on each node. The thread pool has a fixed number of workers ready to handle requests. When a request arrives and a worker is free, that worker handles it. By default the number of workers is derived from the number of CPU cores on the node (in recent versions, int((cores * 3) / 2) + 1). When all the workers are busy and more search requests arrive, those requests go to the queue. The size of the queue is also limited. The default depends on the version (1000 in recent versions), and if more requests arrive in parallel than the workers and the queue can hold, the extra requests are rejected, as you can see in the error log.

The solutions to this would be:

1. Increase the size of the queue or thread pool - The immediate fix is to increase the size of the search queue. We can also increase the size of the thread pool, but that might adversely affect the performance of individual queries, so increasing the queue is usually the better option. Remember, though, that this queue is held in memory, and increasing it too much can result in out-of-memory errors. You can get more info on the same here.

2. Increase the number of nodes and replicas - Remember that each node has its own search thread pool and queue. Also, a search can be served by either the primary shard or a replica, so adding nodes and replicas spreads the same load across more pools.
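To make the admission behaviour described above concrete, here is a minimal sketch of a toy model, not Elasticsearch code. The names `Admission`, `admit` and `admit_across_nodes` are hypothetical, the worker and queue numbers are made-up examples, and the model assumes the whole burst arrives at once and is split evenly between nodes.

```python
from dataclasses import dataclass


@dataclass
class Admission:
    running: int   # requests picked up immediately by a free worker
    queued: int    # requests waiting in the bounded queue
    rejected: int  # requests turned away because the queue was full


def admit(burst: int, workers: int, queue_size: int) -> Admission:
    """Toy model of one node's search pool receiving `burst` requests at once."""
    running = min(burst, workers)
    queued = min(burst - running, queue_size)
    rejected = burst - running - queued
    return Admission(running, queued, rejected)


def admit_across_nodes(burst: int, nodes: int, workers: int, queue_size: int) -> Admission:
    """Split the burst evenly over `nodes`, each with its own pool and queue."""
    base, extra = divmod(burst, nodes)
    total = Admission(0, 0, 0)
    for i in range(nodes):
        # The first `extra` nodes take one request more so the shares sum to `burst`.
        share = base + (1 if i < extra else 0)
        a = admit(share, workers, queue_size)
        total = Admission(
            total.running + a.running,
            total.queued + a.queued,
            total.rejected + a.rejected,
        )
    return total


if __name__ == "__main__":
    print(admit(150, workers=8, queue_size=100))
    print(admit(150, workers=8, queue_size=200))
    print(admit_across_nodes(150, nodes=2, workers=8, queue_size=100))
```

With these example numbers, a single node with 8 workers and a queue of 100 rejects 42 of 150 simultaneous requests. Doubling the queue to 200, or adding a second node with the same settings, brings the rejections down to zero, which mirrors the two options above. The model ignores how long each query takes and how much memory queued requests hold, so it says nothing about the out-of-memory risk of a very large queue.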