[e16e8f2] | 1 | |
---|
| 2 | ============================================================================ |
---|
| 3 | LZO -- a real-time data compression library |
---|
| 4 | ============================================================================ |
---|
| 5 | |
---|
| 6 | Author : Markus Franz Xaver Johannes Oberhumer |
---|
| 7 | <markus@oberhumer.com> |
---|
| 8 | http://www.oberhumer.com/opensource/lzo/ |
---|
| 9 | Version : 2.07 |
---|
| 10 | Date : 25 Jun 2014 |
---|
| 11 | |
---|
| 12 | |
---|
| 13 | Abstract |
---|
| 14 | -------- |
---|
| 15 | LZO is a portable lossless data compression library written in ANSI C. |
---|
| 16 | It offers pretty fast compression and very fast decompression. |
---|
| 17 | Decompression requires no memory. |
---|
| 18 | |
---|
| 19 | In addition there are slower compression levels achieving a quite |
---|
| 20 | competitive compression ratio while still decompressing at |
---|
| 21 | this very high speed. |
---|
| 22 | |
---|
| 23 | The LZO algorithms and implementations are copyrighted OpenSource |
---|
| 24 | distributed under the GNU General Public License. |
---|
| 25 | |
---|
| 26 | |
---|
| 27 | Introduction |
---|
| 28 | ------------ |
---|
| 29 | LZO is a data compression library which is suitable for data |
---|
| 30 | de-/compression in real-time. This means it favours speed |
---|
| 31 | over compression ratio. |
---|
| 32 | |
---|
| 33 | The acronym LZO is standing for Lempel-Ziv-Oberhumer. |
---|
| 34 | |
---|
| 35 | LZO is written in ANSI C. Both the source code and the compressed |
---|
| 36 | data format are designed to be portable across platforms. |
---|
| 37 | |
---|
| 38 | LZO implements a number of algorithms with the following features: |
---|
| 39 | |
---|
| 40 | - Decompression is simple and *very* fast. |
---|
| 41 | - Requires no memory for decompression. |
---|
| 42 | - Compression is pretty fast. |
---|
| 43 | - Requires 64 KiB of memory for compression. |
---|
| 44 | - Allows you to dial up extra compression at a speed cost in the |
---|
| 45 | compressor. The speed of the decompressor is not reduced. |
---|
| 46 | - Includes compression levels for generating pre-compressed |
---|
| 47 | data which achieve a quite competitive compression ratio. |
---|
| 48 | - There is also a compression level which needs only 8 KiB for compression. |
---|
| 49 | - Algorithm is thread safe. |
---|
| 50 | - Algorithm is lossless. |
---|
| 51 | |
---|
| 52 | LZO supports overlapping compression and in-place decompression. |
---|
| 53 | |
---|
| 54 | |
---|
| 55 | Design criteria |
---|
| 56 | --------------- |
---|
| 57 | LZO was designed with speed in mind. Decompressor speed has been |
---|
| 58 | favoured over compressor speed. Real-time decompression should be |
---|
| 59 | possible for virtually any application. The implementation of the |
---|
| 60 | LZO1X decompressor in optimized i386 assembler code runs about at |
---|
| 61 | the third of the speed of a memcpy() - and even faster for many files. |
---|
| 62 | |
---|
| 63 | In fact I first wrote the decompressor of each algorithm thereby |
---|
| 64 | defining the compressed data format, verified it with manually |
---|
| 65 | created test data and at last added the compressor. |
---|
| 66 | |
---|
| 67 | |
---|
| 68 | Performance |
---|
| 69 | ----------- |
---|
| 70 | To keep you interested, here is an overview of the average results |
---|
| 71 | when compressing the Calgary Corpus test suite with a blocksize |
---|
| 72 | of 256 KiB, originally done on an ancient Intel Pentium 133. |
---|
| 73 | |
---|
| 74 | The naming convention of the various algorithms goes LZOxx-N, where N is |
---|
| 75 | the compression level. Range 1-9 indicates the fast standard levels using |
---|
| 76 | 64 KiB memory for compression. Level 99 offers better compression at the |
---|
| 77 | cost of more memory (256 KiB), and is still reasonably fast. |
---|
| 78 | Level 999 achieves nearly optimal compression - but it is slow |
---|
| 79 | and uses much memory, and is mainly intended for generating |
---|
| 80 | pre-compressed data. |
---|
| 81 | |
---|
| 82 | The C version of LZO1X-1 is about 4-5 times faster than the fastest |
---|
| 83 | zlib compression level, and it also outperforms other algorithms |
---|
| 84 | like LZRW1-A and LZV in both compression ratio and compression speed |
---|
| 85 | and decompression speed. |
---|
| 86 | |
---|
| 87 | +------------------------------------------------------------------------+ |
---|
| 88 | | Algorithm Length CxB ComLen %Remn Bits Com K/s Dec K/s | |
---|
| 89 | | --------- ------ --- ------ ----- ---- ------- ------- | |
---|
| 90 | | | |
---|
| 91 | | memcpy() 224401 1 224401 100.0 8.00 60956.83 59124.58 | |
---|
| 92 | | | |
---|
| 93 | | LZO1-1 224401 1 117362 53.1 4.25 4665.24 13341.98 | |
---|
| 94 | | LZO1-99 224401 1 101560 46.7 3.73 1373.29 13823.40 | |
---|
| 95 | | | |
---|
| 96 | | LZO1A-1 224401 1 115174 51.7 4.14 4937.83 14410.35 | |
---|
| 97 | | LZO1A-99 224401 1 99958 45.5 3.64 1362.72 14734.17 | |
---|
| 98 | | | |
---|
| 99 | | LZO1B-1 224401 1 109590 49.6 3.97 4565.53 15438.34 | |
---|
| 100 | | LZO1B-2 224401 1 106235 48.4 3.88 4297.33 15492.79 | |
---|
| 101 | | LZO1B-3 224401 1 104395 47.8 3.83 4018.21 15373.52 | |
---|
| 102 | | LZO1B-4 224401 1 104828 47.4 3.79 3024.48 15100.11 | |
---|
| 103 | | LZO1B-5 224401 1 102724 46.7 3.73 2827.82 15427.62 | |
---|
| 104 | | LZO1B-6 224401 1 101210 46.0 3.68 2615.96 15325.68 | |
---|
| 105 | | LZO1B-7 224401 1 101388 46.0 3.68 2430.89 15361.47 | |
---|
| 106 | | LZO1B-8 224401 1 99453 45.2 3.62 2183.87 15402.77 | |
---|
| 107 | | LZO1B-9 224401 1 99118 45.0 3.60 1677.06 15069.60 | |
---|
| 108 | | LZO1B-99 224401 1 95399 43.6 3.48 1286.87 15656.11 | |
---|
| 109 | | LZO1B-999 224401 1 83934 39.1 3.13 232.40 16445.05 | |
---|
| 110 | | | |
---|
| 111 | | LZO1C-1 224401 1 111735 50.4 4.03 4883.08 15570.91 | |
---|
| 112 | | LZO1C-2 224401 1 108652 49.3 3.94 4424.24 15733.14 | |
---|
| 113 | | LZO1C-3 224401 1 106810 48.7 3.89 4127.65 15645.69 | |
---|
| 114 | | LZO1C-4 224401 1 105717 47.7 3.82 3007.92 15346.44 | |
---|
| 115 | | LZO1C-5 224401 1 103605 47.0 3.76 2829.15 15153.88 | |
---|
| 116 | | LZO1C-6 224401 1 102585 46.5 3.72 2631.37 15257.58 | |
---|
| 117 | | LZO1C-7 224401 1 101937 46.2 3.70 2378.57 15492.49 | |
---|
| 118 | | LZO1C-8 224401 1 100779 45.6 3.65 2171.93 15386.07 | |
---|
| 119 | | LZO1C-9 224401 1 100255 45.4 3.63 1691.44 15194.68 | |
---|
| 120 | | LZO1C-99 224401 1 97252 44.1 3.53 1462.88 15341.37 | |
---|
| 121 | | LZO1C-999 224401 1 87740 40.2 3.21 306.44 16411.94 | |
---|
| 122 | | | |
---|
| 123 | | LZO1F-1 224401 1 113412 50.8 4.07 4755.97 16074.12 | |
---|
| 124 | | LZO1F-999 224401 1 89599 40.3 3.23 280.68 16553.90 | |
---|
| 125 | | | |
---|
| 126 | | LZO1X-1(11) 224401 1 118810 52.6 4.21 4544.42 15879.04 | |
---|
| 127 | | LZO1X-1(12) 224401 1 113675 50.6 4.05 4411.15 15721.59 | |
---|
| 128 | | LZO1X-1 224401 1 109323 49.4 3.95 4991.76 15584.89 | |
---|
| 129 | | LZO1X-1(15) 224401 1 108500 49.1 3.93 5077.50 15744.56 | |
---|
| 130 | | LZO1X-999 224401 1 82854 38.0 3.04 135.77 16548.48 | |
---|
| 131 | | | |
---|
| 132 | | LZO1Y-1 224401 1 110820 49.8 3.98 4952.52 15638.82 | |
---|
| 133 | | LZO1Y-999 224401 1 83614 38.2 3.05 135.07 16385.40 | |
---|
| 134 | | | |
---|
| 135 | | LZO1Z-999 224401 1 83034 38.0 3.04 133.31 10553.74 | |
---|
| 136 | | | |
---|
| 137 | | LZO2A-999 224401 1 87880 40.0 3.20 301.21 8115.75 | |
---|
| 138 | +------------------------------------------------------------------------+ |
---|
| 139 | |
---|
| 140 | Notes: |
---|
| 141 | - CxB is the number of blocks |
---|
| 142 | - K/s is the speed measured in 1000 uncompressed bytes per second |
---|
| 143 | - the assembler decompressors are even faster |
---|
| 144 | |
---|
| 145 | |
---|
| 146 | Short documentation |
---|
| 147 | ------------------- |
---|
| 148 | LZO is a block compression algorithm - it compresses and decompresses |
---|
| 149 | a block of data. Block size must be the same for compression |
---|
| 150 | and decompression. |
---|
| 151 | |
---|
| 152 | LZO compresses a block of data into matches (a sliding dictionary) |
---|
| 153 | and runs of non-matching literals. LZO takes care about long matches |
---|
| 154 | and long literal runs so that it produces good results on highly |
---|
| 155 | redundant data and deals acceptably with non-compressible data. |
---|
| 156 | |
---|
| 157 | When dealing with incompressible data, LZO expands the input |
---|
| 158 | block by a maximum of 64 bytes per 1024 bytes input. |
---|
| 159 | |
---|
| 160 | I have verified LZO using such tools as valgrind and other memory checkers. |
---|
| 161 | And in addition to compressing gigabytes of files when tuning some parameters |
---|
| 162 | I have also consulted various 'lint' programs to spot potential portability |
---|
| 163 | problems. LZO is free of any known bugs. |
---|
| 164 | |
---|
| 165 | |
---|
| 166 | The algorithms |
---|
| 167 | -------------- |
---|
| 168 | There are too many algorithms implemented. But I want to support |
---|
| 169 | unlimited backward compatibility, so I will not reduce the LZO |
---|
| 170 | distribution in the future. |
---|
| 171 | |
---|
| 172 | As the many object files are mostly independent of each other, the |
---|
| 173 | size overhead for an executable statically linked with the LZO library |
---|
| 174 | is usually pretty low (just a few KiB) because the linker will only add |
---|
| 175 | the modules that you are actually using. |
---|
| 176 | |
---|
| 177 | I first published LZO1 and LZO1A in the Internet newsgroups |
---|
| 178 | comp.compression and comp.compression.research in March 1996. |
---|
| 179 | They are mainly included for compatibility reasons. The LZO2A |
---|
| 180 | decompressor is too slow, and there is no fast compressor anyway. |
---|
| 181 | |
---|
| 182 | My experiments have shown that LZO1B is good with a large blocksize |
---|
| 183 | or with very redundant data, LZO1F is good with a small blocksize or |
---|
| 184 | with binary data and that LZO1X is often the best choice of all. |
---|
| 185 | LZO1Y and LZO1Z are almost identical to LZO1X - they can achieve a |
---|
| 186 | better compression ratio on some files. |
---|
| 187 | Beware, your mileage may vary. |
---|
| 188 | |
---|
| 189 | |
---|
| 190 | Usage of the library |
---|
| 191 | -------------------- |
---|
| 192 | Despite of its size, the basic usage of LZO is really very simple. |
---|
| 193 | |
---|
| 194 | Let's assume you want to compress some data with LZO1X-1: |
---|
| 195 | A) compression |
---|
| 196 | * include <lzo/lzo1x.h> |
---|
| 197 | call lzo_init() |
---|
| 198 | compress your data with lzo1x_1_compress() |
---|
| 199 | * link your application with the LZO library |
---|
| 200 | B) decompression |
---|
| 201 | * include <lzo/lzo1x.h> |
---|
| 202 | call lzo_init() |
---|
| 203 | decompress your data with lzo1x_decompress() |
---|
| 204 | * link your application with the LZO library |
---|
| 205 | |
---|
| 206 | The program examples/simple.c shows a fully working example. |
---|
| 207 | See also LZO.FAQ for more information. |
---|
| 208 | |
---|
| 209 | |
---|
| 210 | Building LZO |
---|
| 211 | ------------ |
---|
| 212 | As LZO uses Autoconf+Automake+Libtool the building process under |
---|
| 213 | UNIX systems should be very unproblematic. Shared libraries are |
---|
| 214 | supported on many architectures as well. |
---|
| 215 | For detailed instructions see the file INSTALL. |
---|
| 216 | |
---|
| 217 | Please note that due to the design of the ELF executable format |
---|
| 218 | the performance of a shared library on i386 systems (e.g. Linux) |
---|
| 219 | is a little bit slower, so you may want to link your applications |
---|
| 220 | with the static version (liblzo2.a) anyway. |
---|
| 221 | |
---|
| 222 | For building under DOS, Win16, Win32, OS/2 and other systems |
---|
| 223 | take a look at the file B/00readme.txt. |
---|
| 224 | |
---|
| 225 | In case of troubles (like decompression data errors) try recompiling |
---|
| 226 | everything without optimizations - LZO may break the optimizer |
---|
| 227 | of your compiler. See the file BUGS. |
---|
| 228 | |
---|
| 229 | LZO is written in ANSI C. In particular this means: |
---|
| 230 | - your compiler must understand prototypes |
---|
| 231 | - your compiler must understand prototypes in function pointers |
---|
| 232 | - your compiler must correctly promote integrals ("value-preserving") |
---|
| 233 | - your preprocessor must implement #elif, #error and stringizing |
---|
| 234 | - you must have a conforming and correct <limits.h> header |
---|
| 235 | - you must have <stddef.h>, <string.h> and other ANSI C headers |
---|
| 236 | - you should have size_t and ptrdiff_t |
---|
| 237 | |
---|
| 238 | |
---|
| 239 | Portability |
---|
| 240 | ----------- |
---|
| 241 | I have built and tested LZO successfully on a variety of platforms |
---|
| 242 | including DOS (16 + 32 bit), Windows 3.x (16-bit), Win32, Win64, |
---|
| 243 | Linux, *BSD, HP-UX and many more. |
---|
| 244 | |
---|
| 245 | LZO is also reported to work under AIX, ConvexOS, IRIX, MacOS, PalmOS (Pilot), |
---|
| 246 | PSX (Sony Playstation), Solaris, SunOS, TOS (Atari ST) and VxWorks. |
---|
| 247 | Furthermore it is said that its performance on a Cray is superior |
---|
| 248 | to all other machines... |
---|
| 249 | |
---|
| 250 | And I think it would be much fun to translate the decompressors |
---|
| 251 | to Z-80 or 6502 assembly. |
---|
| 252 | |
---|
| 253 | |
---|
| 254 | The future |
---|
| 255 | ---------- |
---|
| 256 | Here is what I'm planning for the next months. No promises, though... |
---|
| 257 | |
---|
| 258 | - interfaces to .NET and Mono |
---|
| 259 | - interfaces to Perl, Java, Python, Delphi, Visual Basic, ... |
---|
| 260 | - improve documentation and API reference |
---|
| 261 | |
---|
| 262 | |
---|
| 263 | Some comments about the source code |
---|
| 264 | ----------------------------------- |
---|
| 265 | Be warned: the main source code in the 'src' directory is a |
---|
| 266 | real pain to understand as I've experimented with hundreds of slightly |
---|
| 267 | different versions. It contains many #if and some gotos, and |
---|
| 268 | is *completely optimized for speed* and not for readability. |
---|
| 269 | Code sharing of the different algorithms is implemented by stressing |
---|
| 270 | the preprocessor - this can be really confusing. Lots of marcos and |
---|
| 271 | assertions don't make things better. |
---|
| 272 | |
---|
| 273 | Nevertheless LZO compiles very quietly on a variety of |
---|
| 274 | compilers with the highest warning levels turned on, even |
---|
| 275 | in C++ mode. |
---|
| 276 | |
---|
| 277 | |
---|
| 278 | Copyright |
---|
| 279 | --------- |
---|
| 280 | LZO is Copyright (C) 1996-2014 Markus Franz Xaver Oberhumer |
---|
| 281 | All Rights Reserved. |
---|
| 282 | |
---|
| 283 | LZO is distributed under the terms of the GNU General Public License (GPL). |
---|
| 284 | See the file COPYING. |
---|
| 285 | |
---|
| 286 | Special licenses for commercial and other applications which |
---|
| 287 | are not willing to accept the GNU General Public License |
---|
| 288 | are available by contacting the author. |
---|
| 289 | |
---|
| 290 | |
---|
| 291 | |
---|