Setting: you are sending bits through a communication channel, each bit has a random probability of getting flipped, and so you use some error correction code to achieve some minimal error, at the expense of longer messages.

This theorem sets an upper bound on how efficient you can be in your encoding, for any encoding.

The next big question, which the theorem does not cover is how to construct codes that reach or approach the limit. Important such codes include:

But besides this, there is also the practical consideration of if you can encode/decode fast enough to keep up with the coded bandwidth given your hardware capabilities.

news.mit.edu/2010/gallager-codes-0121 explains how turbo codes were first reached without a very good mathematical proof behind them, but were still revolutionary in experimental performance, e.g. turbo codes were used in 3G/4G.

But this motivated researchers to find other such algorithms that they would be able to prove things about, and so they rediscovered the much earlier low-density parity-check code, which had been published in the 60's but was forgotten, partially because it was computationally expensive.