A while ago, we wrote about audio codecs, what they are and what the different varieties are.
This time, we’re going to tackle video codecs, focusing on the most popular one right now: H.264.
Video codecs are similar to audio codecs in that they are ways of representing an analog medium—a series of pictures, in this case—in digital form while using an amount of resources that fits the task.
H.264 is the video codec that has, for now, found the optimal balance between picture quality and compression, and it’s adaptive to video of lesser or higher quality. That’s why it’s become so popular, particularly for streaming media like video conferencing. H.265, however, is gaining traction, and we’ll briefly cover that at the end. We also discuss the previous standard codec: H.263.
What’s in a name? H.264
H.264 is a standard with a complicated history that we don’t need to get into. A number of companies have patents that pertain to it. It’s also often known in conjunction with MPEG-4, e.g. H.264/MPEG-4 AVC. It was formally approved for the first time in March 2003.
H.264 was developed by the International Telecommunication Union (ITU) in conjunction with the Moving Picture Experts Group (MPEG).
The oh-so-catchy name H.264 comes from its place in the numerous standards developed by the ITU. AVC stands for Advanced Video Coding. MPEG-4, as you might guess, is the name for the standard from the Moving Picture Experts Group, where H.264 is included in a larger standard as MPEG-4 Part 10. We’ll get to the important Scalable Video Coding (SVC) version in a bit, but let’s not complicate things too much right at the beginning.
Clear as mud?
Good, let’s move on to the good stuff.
Resource usage
Among common uses of computers, video is the most resource intensive. You might say: What about games? The fantastic rigs that gamers build, decked out in all the latest gear, are all the result of pushing video in higher quality.
Given where technology stands right now, and for the foreseeable future, video will need to be compressed.
Compared with H.263, which was the standard that was previously the most common, H.264 gives you the same picture quality at a third to half of the resource usage.
It eliminates redundant information that might otherwise clog your bandwidth. One way it does this (and the algorithms are very complex, so this is only one way) is like this: Picture a video of a house. A cat runs in front of the house.
Without this kind of compression, every frame of that video would have to be transmitted or recorded in full. H.264 is capable of eliminating the unchanging information (the house) and only recording changes (the cat).
This is an incredible amount of resource savings!
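To make the house-and-cat idea concrete, here is a minimal sketch in Python. It is a toy, not how H.264 actually works: the real codec uses motion-compensated block prediction and much more. The function names `encode_delta` and `apply_delta` and the tiny 4 x 6 "frames" are made up for illustration.

```python
# Toy illustration of storing only what changed between two frames.
# Assumption: a frame is a small grid of integers. Real H.264 works on
# blocks of pixels with motion prediction, which is far more involved.

def encode_delta(previous, current):
    """Return (row, col, value) for every pixel that differs."""
    changes = []
    for r, (prev_row, cur_row) in enumerate(zip(previous, current)):
        for c, (old, new) in enumerate(zip(prev_row, cur_row)):
            if old != new:
                changes.append((r, c, new))
    return changes


def apply_delta(previous, changes):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = [row[:] for row in previous]
    for r, c, value in changes:
        frame[r][c] = value
    return frame


# The "house": a 4 x 6 frame where every pixel is 1.
house = [[1] * 6 for _ in range(4)]

# The same frame with a two-pixel "cat" (value 9) in the bottom row.
with_cat = [row[:] for row in house]
with_cat[3][1] = 9
with_cat[3][2] = 9

delta = encode_delta(house, with_cat)
print(len(delta), "changed pixels out of", 4 * 6)

# The receiver can rebuild the full frame from the old frame and the delta.
assert apply_delta(house, delta) == with_cat
```

Only the cat's pixels travel over the wire for the second frame; the house is reused from the frame before.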
Remember, however, that even losslessly encoded video can be of poor quality if the original recording technology isn’t any good. If you record in 240p, even the best codec won’t stop your video from looking pixelated.
How much of your bandwidth is going to be used up? It obviously depends on your precise deployment, but here are some general numbers taken from a 2014 Polycom white paper [pdf]. These numbers are for video conferencing using the H.264 High Profile, which will be explained shortly:
- 4SIF (roughly 480p SD, 704 x 480 px): 256-385 Kbps
- 720p HD (1280 x 720 px): 512-768 Kbps
- 1080p Full HD (1920 x 1080 px): 1024-1920 Kbps
So you can see that even with the excellent compression provided by H.264, video content still is quite resource intensive. Properly configuring the Quality of Service (QoS) settings is essential when installing a video conferencing system.
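For a feel of what those numbers mean, here is a rough back-of-the-envelope calculation in Python. The frame rate (30 fps) and color depth (24 bits per pixel) are assumptions for illustration and do not come from the white paper, and Kbps is taken to mean 1,000 bits per second. The helper names are hypothetical.

```python
# Back-of-the-envelope bandwidth arithmetic.
# Assumptions (not from the white paper): 30 frames per second,
# 24 bits per pixel, and 1 Kbps = 1,000 bits per second.

def raw_bitrate_bps(width, height, fps=30, bits_per_pixel=24):
    """Bits per second needed to send every pixel of every frame uncompressed."""
    return width * height * bits_per_pixel * fps


def megabytes_per_hour(kbps):
    """Data transferred in one hour at a steady bitrate, in decimal megabytes."""
    bits_per_hour = kbps * 1000 * 3600
    return bits_per_hour / 8 / 1_000_000


raw = raw_bitrate_bps(1920, 1080)   # uncompressed 1080p
compressed = 1920 * 1000            # top of the 1080p range above, in bits/s

print(f"Raw 1080p: {raw / 1_000_000:.0f} Mbps")
print(f"Compression ratio at 1920 Kbps: about {raw / compressed:.0f}:1")
print(f"One hour at 1920 Kbps: {megabytes_per_hour(1920):.0f} MB")
```

Under these assumptions, uncompressed 1080p would need roughly 1.5 Gbps, so even the top of the H.264 range is a reduction of several hundred to one. It still adds up to hundreds of megabytes per hour per stream.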
How did we get here? H.263
To understand why H.264 is such a big deal, it’s a good idea to look back at H.263, the standard that it supplanted.
H.263 is a product of the mid-90s, being first published in 1996. It became the standard for the first big wave of video conferencing products. Many systems still support it, because it was (and is) so widely used.
It was developed with telephony and video conferencing in mind, which meant the data was sent over the PSTN (public switched telephone network). Nowadays, all this data travels over the internet, and in fact H.263 was expanded for the internet, too.
But the limitations of the PSTN still shape the limitations of H.263.
H.264 was developed with the internet in mind, using the processing power of newer technology to significantly reduce the amount of bandwidth and storage that is needed to send video.
H.263 is now considered a legacy design, and hasn’t been updated since 2005.
Video conferencing with H.264
H.264 is an incredibly adaptable codec. It scales based on the quality of the original content.
There are many different profiles of the H.264 codec, which you’ll often see in spec sheets: H.264 Baseline, H.264 High Profile, etc. What distinguishes each is quite complicated, so we’re not going to get into the specifics, but knowing what each is used for gives you an idea of what H.264 does.
Here are three you’ll often see in video conferencing applications:
- H.264 Baseline is for more casual applications. It uses a simpler set of coding tools, which makes it cheap to encode and decode at some cost in picture quality for a given amount of bandwidth. You’ll often see it used for smartphones, which require more efficiency.
- H.264 Main Profile includes more technologies, which preserve more of the picture. Modern smartphones, which are so much more powerful than previous generations, will often make use of Main Profile.
- H.264 High Profile is the most common one used in video conferencing in formal situations with monitors, TVs or the like. It retains a very high standard of picture quality.
Ultimately, which profile you use depends on the technology you have available to you and what your internet connection is like.
What about Scalable Video Coding (H.264 SVC)?
Annex G of the H.264 standard is known as Scalable Video Coding or SVC. Sometimes you’ll see this as its own thing, but make no mistake: it’s part of the H.264 standard.
SVC is particularly useful when, say, someone at a desk is chatting simultaneously with one person on a mobile device and another using a large-screen monitor. In other words, H.264 SVC is useful when there is a diverse group of endpoints communicating with each other.
SVC is scalable. Essentially, without adding much in the way of resource load, it encodes the video in layers so that endpoints can reproduce either a higher or a lower quality version of the same video. So if someone is on a smartphone with a bad connection, they’ll get a workable stream of video in lower quality, while someone in the office on a stable broadband connection will get the same stream in high quality.
Instead of sending one homogeneous stream of video, SVC gives the endpoint the flexibility to display the video stream that fits its capabilities.
You can see why so many people get excited about SVC!
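Here is a small Python sketch of that idea: one stream made of stacked layers, with each endpoint taking as many layers as its connection can carry. The layer names and bitrates are invented for illustration (loosely borrowed from the bandwidth figures earlier in this post), and `layers_for` is a hypothetical helper, not part of any real SVC library.

```python
from typing import List, NamedTuple


class Layer(NamedTuple):
    name: str
    extra_kbps: int  # bitrate this layer adds on top of the layers below it


# Hypothetical layer stack: a base layer plus two enhancement layers.
# The numbers are illustrative only.
LAYERS = [
    Layer("base (480p)", 256),
    Layer("enhancement 1 (720p)", 256),
    Layer("enhancement 2 (1080p)", 512),
]


def layers_for(available_kbps: int, layers: List[Layer] = LAYERS) -> List[str]:
    """Pick layers from the bottom up until the next one no longer fits."""
    chosen = []
    used = 0
    for layer in layers:
        if used + layer.extra_kbps > available_kbps:
            break  # higher layers depend on this one, so stop here
        used += layer.extra_kbps
        chosen.append(layer.name)
    return chosen


print("Smartphone on a weak connection:", layers_for(300))
print("Office desktop on broadband:", layers_for(2000))
```

The point of the design is that the sender encodes once. Each endpoint simply stops at the highest layer it can handle, instead of the sender producing a separate stream per device.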
I’ve heard about H.265…
Of course, in the tech world, nothing sits still for long. H.264 is an excellent standard, but engineers are always trying to push the envelope.
What’s next? H.265, also known as High Efficiency Video Coding or HEVC, whose initial version was ratified in January 2013. You don’t see it much yet, but it’s coming!
H.265 is roughly twice as efficient as H.264. Yes, twice.
This means you use half the resources for the same picture quality, or get a much better picture quality for the same resources.
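As a rough illustration, here is what halving would do to the H.264 bandwidth ranges quoted earlier in this post. The 50% saving is a rule-of-thumb assumption, not a measured result, and actual savings depend on the content and the encoder. The function name `estimate_h265_kbps` is made up for this sketch.

```python
# H.264 High Profile ranges quoted earlier in this post, in Kbps.
H264_KBPS = {
    "4SIF": (256, 385),
    "720p": (512, 768),
    "1080p": (1024, 1920),
}


def estimate_h265_kbps(range_kbps, saving=0.5):
    """Scale a (low, high) bitrate range by an assumed bitrate saving.

    `saving` is an assumption: 0.5 means "half the bitrate for the same
    picture quality", which will not hold exactly in practice.
    """
    low, high = range_kbps
    return (round(low * (1 - saving)), round(high * (1 - saving)))


for name, h264_range in H264_KBPS.items():
    low, high = estimate_h265_kbps(h264_range)
    print(f"{name}: H.264 {h264_range[0]}-{h264_range[1]} Kbps, "
          f"H.265 estimate {low}-{high} Kbps")
```

Treat the output as a planning estimate at best.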
Moreover, it’s looking to the future, because it supports resolutions up to 8192 x 4320, commonly known as 8k. This is a resolution that is barely seen at present.
H.265 works by being even more efficient and clever at finding ways to compress data without compromising picture quality. The one downside is that H.265 requires much more processing power than H.264, because of the diversity of methods used to be more efficient.
So you gain in bandwidth and storage efficiency while using more computing power with HEVC.
Still, with video being something like 80% of all the bandwidth usage on the internet, you can see why H.265 is going to be a big deal. And with video conferencing, while 1080p is still acceptable for many organizations, the advent of 4k video cameras will mean H.265 is going to filter into enterprises in the near future.