Unishox: A hybrid encoder for Short Unicode Strings
Unishox is a hybrid encoding technique that compresses short Unicode strings using context-aware pre-mapped codes and delta coding, yielding surprisingly good compression ratios.
This article discusses a hybrid encoding method for compressing short Unicode strings of arbitrary length, including Latin/English text and printable special characters. Lossless entropy encoding methods have not sufficiently addressed this case so far.
Although it appears inconsequential, the space occupied by such strings becomes significant in memory-constrained environments such as the Arduino Uno and ESP8266. Text exchange in chat applications is another area where such compression could yield cost savings. Savings in bandwidth and storage cost are also possible when storing and retrieving independent strings in cloud databases.
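To make the delta coding idea mentioned above more concrete, the following is a minimal Python sketch, not the actual Unishox bit format. It assumes the simplest possible scheme: each character is replaced by the difference between its Unicode code point and that of the previous character. The function names `delta_encode` and `delta_decode` are hypothetical and chosen only for illustration.

```python
def delta_encode(text):
    """Return each code point as a difference from the previous one.

    Illustrative assumption: the first delta is taken relative to 0,
    so it equals the first character's code point.
    """
    deltas = []
    prev = 0
    for ch in text:
        cp = ord(ch)
        deltas.append(cp - prev)
        prev = cp
    return deltas


def delta_decode(deltas):
    """Invert delta_encode by accumulating the differences."""
    chars = []
    prev = 0
    for d in deltas:
        prev += d
        chars.append(chr(prev))
    return "".join(chars)


if __name__ == "__main__":
    word = "привет"  # Cyrillic letters sit close together in one Unicode block
    encoded = delta_encode(word)
    print(encoded)  # [1087, 1, -8, -6, 3, 13]
    assert delta_decode(encoded) == word
```

After the first character, every delta in this example is a small number, because letters of one script lie close together in the Unicode code space. Small values like these can be represented in fewer bits than full code points, which is the general motivation for delta coding text outside the Latin range.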