(Diagram created by me)
Compression reduces the storage space a file requires, so more files fit in the same amount of storage. It is particularly important for sharing files over networks or the Internet: the larger a file, the longer it takes to transfer, so compressing files increases the number that can be transferred in a given time. Apps like Google Photos compress files so that they can be searched for and downloaded quickly. Downloading a compressed file over the Internet is faster than downloading the full-size version.
"Within the last decade the use of data compression has become ubiquitous. From mp3 players whose headphones seem to adorn the ears of most young (and some not so young) people, to cell phones, to DVDs, to digital television, data compression is an integral part of almost all information technology. This incorporation of compression into more and more of our lives also points to a certain degree of maturation of the technology." (Khalid Sayood, Introduction to Data Compression)
The main advantages of compression are reductions in storage hardware, data transmission time, and communication bandwidth, all of which can produce significant cost savings. Compressed files need less storage capacity than uncompressed files, which lowers storage expenses. A compressed file also takes less time to transfer and consumes less network bandwidth, which cuts costs further and improves productivity. The main disadvantage is the extra computing resources needed to compress the data and to decompress it again before use. Because of this, compression vendors prioritize speed and resource efficiency in order to minimize the impact of intensive compression tasks.
There are two categories of compression: lossy and lossless. As the name suggests, lossy compression reduces the size of a file by removing some of its information. This could result in a more pixelated image or a less clear audio recording. Lossless compression, on the other hand, reduces the size of a file without losing any information, so the original file can be recovered exactly from the compressed version. That is not possible with lossy compression, because the discarded information is gone for good. For example, audio files can be compressed lossily by removing the very high or very low frequencies which are least noticeable to the ear. There’s no way to go from the lossy version of the recording back to the full version, as there’s no record of what the high and low frequencies were.
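The difference can be seen in a small Python sketch. The lossless half uses the standard library's zlib module. The lossy half is only a toy stand-in: rather than removing audio frequencies, it keeps the top 4 bits of some made-up 8-bit sample values, which is enough to show that the discarded detail cannot be brought back.

```python
import zlib

# Lossless: zlib gives back exactly the bytes that went in.
original = b"compression " * 50
packed = zlib.compress(original)
restored = zlib.decompress(packed)
print(len(original), "->", len(packed), "bytes; exact copy:", restored == original)

# Lossy (toy example, not a real audio codec): keep only the top 4 bits
# of each 8-bit sample. The sample values are invented for illustration.
samples = [3, 18, 77, 130, 201, 255]
quantized = [s >> 4 for s in samples]   # 4 bits per sample instead of 8
approx = [q << 4 for q in quantized]    # best reconstruction available
print(approx)                           # [0, 16, 64, 128, 192, 240]
print("exact copy:", approx == samples)
```

The reconstructed samples are close to the originals but not equal to them, and nothing in the quantized data says what the dropped low bits were.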
Run length encoding, sometimes called RLE, is a method of lossless compression in which each run of repeated values is replaced with one occurrence of the value followed by the number of times it should be repeated. For example, the string AAAAAABBBBBCCC could be represented as A6B5C3. To work well, run length encoding relies on consecutive pieces of data being the same; if there’s little repetition, it doesn’t offer a great reduction in file size and can even make the data larger.
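As a minimal sketch of the idea in Python (the function names rle_encode and rle_decode are made up for this example, and the decoder assumes the original text contains no digits):

```python
import re
from itertools import groupby

def rle_encode(text):
    """Replace each run of identical characters with the character and its run length."""
    return "".join(f"{char}{len(list(run))}" for char, run in groupby(text))

def rle_decode(encoded):
    """Expand character/count pairs back into runs (assumes no digits in the original text)."""
    return "".join(char * int(count) for char, count in re.findall(r"(\D)(\d+)", encoded))

print(rle_encode("AAAAAABBBBBCCC"))  # A6B5C3
print(rle_decode("A6B5C3"))          # AAAAAABBBBBCCC
print(rle_encode("ABCDEF"))          # A1B1C1D1E1F1
```

The last line shows the weakness: with no repetition at all, the encoded form is twice as long as the input.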
Dictionary encoding is another method of lossless compression. Frequently occurring pieces of data are replaced with an index, and the compressed data is stored alongside a dictionary which maps each index to the data it stands for. The original data can then be restored using the dictionary.
Take, for example, the following passage.
We shall go on to the end.
We shall fight in France,
we shall fight on the seas and oceans,
we shall fight with growing confidence and growing strength in the air,
we shall defend our island, whatever the cost may be.
Frequently occurring phrases include “We shall”, “fight”, “the”, “on” and “in”. Placing these phrases in the dictionary below and replacing each occurrence with the phrase’s index substantially reduces the size of the passage.
(The above diagram was not created by me.)
It’s important to remember that data compressed using dictionary compression must be transferred alongside its dictionary. Without the dictionary, the original data cannot be restored.
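The passage above can be encoded and restored with a short Python sketch. Several things here are assumptions made for illustration: the names encode, decode and PHRASES, the use of single digits as indices (which only works because the passage contains no digits and the dictionary has fewer than ten entries), and the separate lowercase entry "we shall", added so the round trip stays exact.

```python
import re

PASSAGE = """We shall go on to the end.
We shall fight in France,
we shall fight on the seas and oceans,
we shall fight with growing confidence and growing strength in the air,
we shall defend our island, whatever the cost may be."""

# Illustrative dictionary: a phrase's position in the list is its index.
PHRASES = ["We shall", "we shall", "fight", "the", "on", "in"]

def encode(text, phrases):
    """Replace each whole-word occurrence of a dictionary phrase with its index."""
    if any(char.isdigit() for char in text):
        raise ValueError("this toy scheme needs digit-free text")
    index_of = {phrase: i for i, phrase in enumerate(phrases)}
    # Try longer phrases first so a longer entry wins over a shorter one.
    ordered = sorted(phrases, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(p) for p in ordered) + r")\b")
    return pattern.sub(lambda m: str(index_of[m.group(1)]), text)

def decode(encoded, phrases):
    """Look every index up in the dictionary to rebuild the original text."""
    return re.sub(r"\d", lambda m: phrases[int(m.group())], encoded)

encoded = encode(PASSAGE, PHRASES)
print(encoded)
print(len(PASSAGE), "->", len(encoded), "characters, plus the dictionary")
print(decode(encoded, PHRASES) == PASSAGE)
```

decode needs PHRASES as well as the encoded text, which is why the dictionary has to travel with the data, and why its size counts towards the total.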