
Thundering Herd | LuminousmenBlog

Thundering Herd

Imagine that a popular news story has just appeared on the front page of a news site. The story has not been loaded into the cache yet. Once it is on the front page, it is very likely that many users will click on the new article, and they will all do so at approximately the same time.

Because the application server processes these requests concurrently, and every request checks the contents of the cache at approximately the same time, they all come to the same conclusion: miss!

Next, all of the request threads proceed to read the data from the database. Thus, even though the developers have been careful to implement caching in the application, the database is still prone to bursts of activity.
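To make the burst concrete, here is a minimal sketch of the scenario in Python. Everything in it is hypothetical: the in-process `cache` dictionary, the `read_from_database` stand-in, and the `front-page-story` key are invented for illustration, and a `threading.Barrier` is used only to force the "approximately the same time" condition deterministically.

```python
import threading


def simulate_cold_cache(num_requests=20):
    """Return how many database reads a burst of simultaneous requests causes."""
    cache = {}                      # hypothetical in-process cache, initially empty
    db_reads = []                   # one entry per simulated database read
    db_lock = threading.Lock()
    all_checked = threading.Barrier(num_requests)

    def read_from_database(key):
        # Stand-in for a real query; it only records that a read happened.
        with db_lock:
            db_reads.append(key)
        return f"article body for {key}"

    def handle_request(key):
        cached = cache.get(key)     # every request checks the cache first
        all_checked.wait()          # nobody fills the cache until all have looked
        if cached is None:          # so every request sees a miss
            cache[key] = read_from_database(key)

    threads = [
        threading.Thread(target=handle_request, args=("front-page-story",))
        for _ in range(num_requests)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(db_reads)


if __name__ == "__main__":
    print(simulate_cold_cache(20))
```

In this sketch every one of the 20 requests ends up reading from the database, even though a single read would have been enough to fill the cache.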

This is known generally as the "Thundering Herd" problem:

A large number of processes waiting for an event wake up when that event occurs, but only one process can continue running at a time. After the processes wake up, all of them request the resource, and it is necessary to decide which process may continue working. Once a decision has been made, the remaining processes are put to sleep again, and then wake up again to request access to the resource.

This is repeated until there are no more processes left to wake up. Since all processes use system resources when waking up, it is more efficient if only one process wakes up at a time.
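The cost of waking everyone can be counted in a small experiment. The sketch below is only an illustration under stated assumptions: the `count_wakeups` function, the single "token" that stands in for the resource, and the bookkeeping in `state` are all made up for this example. It compares `Condition.notify_all()` (wake the whole herd) with `Condition.notify()` (wake one waiter), using Python's standard `threading` module.

```python
import threading


def count_wakeups(num_workers, wake_all):
    """Count how often sleeping workers wake while one token at a time is released."""
    lock = threading.Lock()
    resource_free = threading.Condition(lock)   # workers sleep here
    progress = threading.Condition(lock)        # the coordinator sleeps here
    state = {"tokens": 0, "asleep": 0, "wakeups": 0}

    def worker():
        with lock:
            while state["tokens"] == 0:
                state["asleep"] += 1             # going (back) to sleep
                progress.notify()
                resource_free.wait()
                state["wakeups"] += 1            # woken by the coordinator
            state["tokens"] -= 1                 # this worker won the resource
            progress.notify()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    for remaining in range(num_workers, 0, -1):
        with lock:
            # Wait until the previous token is taken and every remaining
            # worker is asleep again, so each round starts from the same state.
            while state["tokens"] > 0 or state["asleep"] < remaining:
                progress.wait()
            state["tokens"] = 1
            if wake_all:
                state["asleep"] = 0              # the whole herd is about to wake
                resource_free.notify_all()
            else:
                state["asleep"] -= 1             # only one sleeper is woken
                resource_free.notify()

    for t in threads:
        t.join()
    return state["wakeups"]


if __name__ == "__main__":
    print("notify_all:", count_wakeups(5, wake_all=True))
    print("notify:    ", count_wakeups(5, wake_all=False))
```

With five workers, waking the herd costs 5 + 4 + 3 + 2 + 1 = 15 wake-ups in this sketch, while waking one worker at a time costs 5. This relies on CPython's current behavior of `notify()` waking exactly one waiter, which the documentation does not promise for future implementations.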

Simply put, when a programmer controls a herd of threads (analogous to a herd of animals) with, for example, a semaphore, the whole herd is woken up to handle a single event (such as an incoming client connection). This problem is solved by writing a scheduler that decides which thread in the pool of sleeping threads should perform a given task.
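One possible reading of such a scheduler is a work queue that hands each incoming task to exactly one idle worker. The sketch below uses Python's standard `queue.Queue` for this. The `serve` function, the worker names, and the integers standing in for client connections are hypothetical, and this is not necessarily the design the text has in mind.

```python
import queue
import threading


def serve(connections, num_workers=4):
    """Dispatch each connection to exactly one worker; return (worker, connection) pairs."""
    tasks = queue.Queue()
    handled = []
    handled_lock = threading.Lock()
    stop = object()                      # sentinel telling a worker to exit

    def worker(name):
        while True:
            conn = tasks.get()           # sleeps until the queue hands over a task
            if conn is stop:
                break
            with handled_lock:
                handled.append((name, conn))

    workers = [
        threading.Thread(target=worker, args=(f"worker-{i}",))
        for i in range(num_workers)
    ]
    for w in workers:
        w.start()
    for conn in connections:
        tasks.put(conn)                  # each task is delivered to a single worker
    for _ in workers:
        tasks.put(stop)                  # one sentinel per worker
    for w in workers:
        w.join()
    return handled


if __name__ == "__main__":
    for name, conn in serve(range(10)):
        print(name, "handled connection", conn)
```

Here the queue plays the role of the scheduler: idle workers sleep inside `tasks.get()`, and each `put` delivers its task to only one of them, so the rest of the pool is not made to compete for it.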

#big_data