1 | |
---|
2 | ============================================================================ |
---|
3 | LZO -- a real-time data compression library |
---|
4 | ============================================================================ |
---|
5 | |
---|
6 | Author : Markus Franz Xaver Johannes Oberhumer |
---|
7 | <markus@oberhumer.com> |
---|
8 | http://www.oberhumer.com/opensource/lzo/ |
---|
9 | Version : 2.07 |
---|
10 | Date : 25 Jun 2014 |
---|
11 | |
---|
12 | |
---|
13 | Abstract |
---|
14 | -------- |
---|
15 | LZO is a portable lossless data compression library written in ANSI C. |
---|
16 | It offers pretty fast compression and very fast decompression. |
---|
17 | Decompression requires no memory. |
---|
18 | |
---|
19 | In addition there are slower compression levels achieving a quite |
---|
20 | competitive compression ratio while still decompressing at |
---|
21 | this very high speed. |
---|
22 | |
---|
23 | The LZO algorithms and implementations are copyrighted OpenSource |
---|
24 | distributed under the GNU General Public License. |
---|
25 | |
---|
26 | |
---|
27 | Introduction |
---|
28 | ------------ |
---|
29 | LZO is a data compression library which is suitable for data |
---|
30 | de-/compression in real-time. This means it favours speed |
---|
31 | over compression ratio. |
---|
32 | |
---|
33 | The acronym LZO is standing for Lempel-Ziv-Oberhumer. |
---|
34 | |
---|
35 | LZO is written in ANSI C. Both the source code and the compressed |
---|
36 | data format are designed to be portable across platforms. |
---|
37 | |
---|
38 | LZO implements a number of algorithms with the following features: |
---|
39 | |
---|
40 | - Decompression is simple and *very* fast. |
---|
41 | - Requires no memory for decompression. |
---|
42 | - Compression is pretty fast. |
---|
43 | - Requires 64 KiB of memory for compression. |
---|
44 | - Allows you to dial up extra compression at a speed cost in the |
---|
45 | compressor. The speed of the decompressor is not reduced. |
---|
46 | - Includes compression levels for generating pre-compressed |
---|
47 | data which achieve a quite competitive compression ratio. |
---|
48 | - There is also a compression level which needs only 8 KiB for compression. |
---|
49 | - Algorithm is thread safe. |
---|
50 | - Algorithm is lossless. |
---|
51 | |
---|
52 | LZO supports overlapping compression and in-place decompression. |
---|
53 | |
---|
54 | |
---|
55 | Design criteria |
---|
56 | --------------- |
---|
57 | LZO was designed with speed in mind. Decompressor speed has been |
---|
58 | favoured over compressor speed. Real-time decompression should be |
---|
59 | possible for virtually any application. The implementation of the |
---|
60 | LZO1X decompressor in optimized i386 assembler code runs about at |
---|
61 | the third of the speed of a memcpy() - and even faster for many files. |
---|
62 | |
---|
63 | In fact I first wrote the decompressor of each algorithm thereby |
---|
64 | defining the compressed data format, verified it with manually |
---|
65 | created test data and at last added the compressor. |
---|
66 | |
---|
67 | |
---|
68 | Performance |
---|
69 | ----------- |
---|
70 | To keep you interested, here is an overview of the average results |
---|
71 | when compressing the Calgary Corpus test suite with a blocksize |
---|
72 | of 256 KiB, originally done on an ancient Intel Pentium 133. |
---|
73 | |
---|
74 | The naming convention of the various algorithms goes LZOxx-N, where N is |
---|
75 | the compression level. Range 1-9 indicates the fast standard levels using |
---|
76 | 64 KiB memory for compression. Level 99 offers better compression at the |
---|
77 | cost of more memory (256 KiB), and is still reasonably fast. |
---|
78 | Level 999 achieves nearly optimal compression - but it is slow |
---|
79 | and uses much memory, and is mainly intended for generating |
---|
80 | pre-compressed data. |
---|
81 | |
---|
82 | The C version of LZO1X-1 is about 4-5 times faster than the fastest |
---|
83 | zlib compression level, and it also outperforms other algorithms |
---|
84 | like LZRW1-A and LZV in both compression ratio and compression speed |
---|
85 | and decompression speed. |
---|
86 | |
---|
87 | +------------------------------------------------------------------------+ |
---|
88 | | Algorithm Length CxB ComLen %Remn Bits Com K/s Dec K/s | |
---|
89 | | --------- ------ --- ------ ----- ---- ------- ------- | |
---|
90 | | | |
---|
91 | | memcpy() 224401 1 224401 100.0 8.00 60956.83 59124.58 | |
---|
92 | | | |
---|
93 | | LZO1-1 224401 1 117362 53.1 4.25 4665.24 13341.98 | |
---|
94 | | LZO1-99 224401 1 101560 46.7 3.73 1373.29 13823.40 | |
---|
95 | | | |
---|
96 | | LZO1A-1 224401 1 115174 51.7 4.14 4937.83 14410.35 | |
---|
97 | | LZO1A-99 224401 1 99958 45.5 3.64 1362.72 14734.17 | |
---|
98 | | | |
---|
99 | | LZO1B-1 224401 1 109590 49.6 3.97 4565.53 15438.34 | |
---|
100 | | LZO1B-2 224401 1 106235 48.4 3.88 4297.33 15492.79 | |
---|
101 | | LZO1B-3 224401 1 104395 47.8 3.83 4018.21 15373.52 | |
---|
102 | | LZO1B-4 224401 1 104828 47.4 3.79 3024.48 15100.11 | |
---|
103 | | LZO1B-5 224401 1 102724 46.7 3.73 2827.82 15427.62 | |
---|
104 | | LZO1B-6 224401 1 101210 46.0 3.68 2615.96 15325.68 | |
---|
105 | | LZO1B-7 224401 1 101388 46.0 3.68 2430.89 15361.47 | |
---|
106 | | LZO1B-8 224401 1 99453 45.2 3.62 2183.87 15402.77 | |
---|
107 | | LZO1B-9 224401 1 99118 45.0 3.60 1677.06 15069.60 | |
---|
108 | | LZO1B-99 224401 1 95399 43.6 3.48 1286.87 15656.11 | |
---|
109 | | LZO1B-999 224401 1 83934 39.1 3.13 232.40 16445.05 | |
---|
110 | | | |
---|
111 | | LZO1C-1 224401 1 111735 50.4 4.03 4883.08 15570.91 | |
---|
112 | | LZO1C-2 224401 1 108652 49.3 3.94 4424.24 15733.14 | |
---|
113 | | LZO1C-3 224401 1 106810 48.7 3.89 4127.65 15645.69 | |
---|
114 | | LZO1C-4 224401 1 105717 47.7 3.82 3007.92 15346.44 | |
---|
115 | | LZO1C-5 224401 1 103605 47.0 3.76 2829.15 15153.88 | |
---|
116 | | LZO1C-6 224401 1 102585 46.5 3.72 2631.37 15257.58 | |
---|
117 | | LZO1C-7 224401 1 101937 46.2 3.70 2378.57 15492.49 | |
---|
118 | | LZO1C-8 224401 1 100779 45.6 3.65 2171.93 15386.07 | |
---|
119 | | LZO1C-9 224401 1 100255 45.4 3.63 1691.44 15194.68 | |
---|
120 | | LZO1C-99 224401 1 97252 44.1 3.53 1462.88 15341.37 | |
---|
121 | | LZO1C-999 224401 1 87740 40.2 3.21 306.44 16411.94 | |
---|
122 | | | |
---|
123 | | LZO1F-1 224401 1 113412 50.8 4.07 4755.97 16074.12 | |
---|
124 | | LZO1F-999 224401 1 89599 40.3 3.23 280.68 16553.90 | |
---|
125 | | | |
---|
126 | | LZO1X-1(11) 224401 1 118810 52.6 4.21 4544.42 15879.04 | |
---|
127 | | LZO1X-1(12) 224401 1 113675 50.6 4.05 4411.15 15721.59 | |
---|
128 | | LZO1X-1 224401 1 109323 49.4 3.95 4991.76 15584.89 | |
---|
129 | | LZO1X-1(15) 224401 1 108500 49.1 3.93 5077.50 15744.56 | |
---|
130 | | LZO1X-999 224401 1 82854 38.0 3.04 135.77 16548.48 | |
---|
131 | | | |
---|
132 | | LZO1Y-1 224401 1 110820 49.8 3.98 4952.52 15638.82 | |
---|
133 | | LZO1Y-999 224401 1 83614 38.2 3.05 135.07 16385.40 | |
---|
134 | | | |
---|
135 | | LZO1Z-999 224401 1 83034 38.0 3.04 133.31 10553.74 | |
---|
136 | | | |
---|
137 | | LZO2A-999 224401 1 87880 40.0 3.20 301.21 8115.75 | |
---|
138 | +------------------------------------------------------------------------+ |
---|
139 | |
---|
140 | Notes: |
---|
141 | - CxB is the number of blocks |
---|
142 | - K/s is the speed measured in 1000 uncompressed bytes per second |
---|
143 | - the assembler decompressors are even faster |
---|
144 | |
---|
145 | |
---|
146 | Short documentation |
---|
147 | ------------------- |
---|
148 | LZO is a block compression algorithm - it compresses and decompresses |
---|
149 | a block of data. Block size must be the same for compression |
---|
150 | and decompression. |
---|
151 | |
---|
152 | LZO compresses a block of data into matches (a sliding dictionary) |
---|
153 | and runs of non-matching literals. LZO takes care about long matches |
---|
154 | and long literal runs so that it produces good results on highly |
---|
155 | redundant data and deals acceptably with non-compressible data. |
---|
156 | |
---|
157 | When dealing with incompressible data, LZO expands the input |
---|
158 | block by a maximum of 64 bytes per 1024 bytes input. |
---|
159 | |
---|
160 | I have verified LZO using such tools as valgrind and other memory checkers. |
---|
161 | And in addition to compressing gigabytes of files when tuning some parameters |
---|
162 | I have also consulted various 'lint' programs to spot potential portability |
---|
163 | problems. LZO is free of any known bugs. |
---|
164 | |
---|
165 | |
---|
166 | The algorithms |
---|
167 | -------------- |
---|
168 | There are too many algorithms implemented. But I want to support |
---|
169 | unlimited backward compatibility, so I will not reduce the LZO |
---|
170 | distribution in the future. |
---|
171 | |
---|
172 | As the many object files are mostly independent of each other, the |
---|
173 | size overhead for an executable statically linked with the LZO library |
---|
174 | is usually pretty low (just a few KiB) because the linker will only add |
---|
175 | the modules that you are actually using. |
---|
176 | |
---|
177 | I first published LZO1 and LZO1A in the Internet newsgroups |
---|
178 | comp.compression and comp.compression.research in March 1996. |
---|
179 | They are mainly included for compatibility reasons. The LZO2A |
---|
180 | decompressor is too slow, and there is no fast compressor anyway. |
---|
181 | |
---|
182 | My experiments have shown that LZO1B is good with a large blocksize |
---|
183 | or with very redundant data, LZO1F is good with a small blocksize or |
---|
184 | with binary data and that LZO1X is often the best choice of all. |
---|
185 | LZO1Y and LZO1Z are almost identical to LZO1X - they can achieve a |
---|
186 | better compression ratio on some files. |
---|
187 | Beware, your mileage may vary. |
---|
188 | |
---|
189 | |
---|
190 | Usage of the library |
---|
191 | -------------------- |
---|
192 | Despite of its size, the basic usage of LZO is really very simple. |
---|
193 | |
---|
194 | Let's assume you want to compress some data with LZO1X-1: |
---|
195 | A) compression |
---|
196 | * include <lzo/lzo1x.h> |
---|
197 | call lzo_init() |
---|
198 | compress your data with lzo1x_1_compress() |
---|
199 | * link your application with the LZO library |
---|
200 | B) decompression |
---|
201 | * include <lzo/lzo1x.h> |
---|
202 | call lzo_init() |
---|
203 | decompress your data with lzo1x_decompress() |
---|
204 | * link your application with the LZO library |
---|
205 | |
---|
206 | The program examples/simple.c shows a fully working example. |
---|
207 | See also LZO.FAQ for more information. |
---|
208 | |
---|
209 | |
---|
210 | Building LZO |
---|
211 | ------------ |
---|
212 | As LZO uses Autoconf+Automake+Libtool the building process under |
---|
213 | UNIX systems should be very unproblematic. Shared libraries are |
---|
214 | supported on many architectures as well. |
---|
215 | For detailed instructions see the file INSTALL. |
---|
216 | |
---|
217 | Please note that due to the design of the ELF executable format |
---|
218 | the performance of a shared library on i386 systems (e.g. Linux) |
---|
219 | is a little bit slower, so you may want to link your applications |
---|
220 | with the static version (liblzo2.a) anyway. |
---|
221 | |
---|
222 | For building under DOS, Win16, Win32, OS/2 and other systems |
---|
223 | take a look at the file B/00readme.txt. |
---|
224 | |
---|
225 | In case of troubles (like decompression data errors) try recompiling |
---|
226 | everything without optimizations - LZO may break the optimizer |
---|
227 | of your compiler. See the file BUGS. |
---|
228 | |
---|
229 | LZO is written in ANSI C. In particular this means: |
---|
230 | - your compiler must understand prototypes |
---|
231 | - your compiler must understand prototypes in function pointers |
---|
232 | - your compiler must correctly promote integrals ("value-preserving") |
---|
233 | - your preprocessor must implement #elif, #error and stringizing |
---|
234 | - you must have a conforming and correct <limits.h> header |
---|
235 | - you must have <stddef.h>, <string.h> and other ANSI C headers |
---|
236 | - you should have size_t and ptrdiff_t |
---|
237 | |
---|
238 | |
---|
239 | Portability |
---|
240 | ----------- |
---|
241 | I have built and tested LZO successfully on a variety of platforms |
---|
242 | including DOS (16 + 32 bit), Windows 3.x (16-bit), Win32, Win64, |
---|
243 | Linux, *BSD, HP-UX and many more. |
---|
244 | |
---|
245 | LZO is also reported to work under AIX, ConvexOS, IRIX, MacOS, PalmOS (Pilot), |
---|
246 | PSX (Sony Playstation), Solaris, SunOS, TOS (Atari ST) and VxWorks. |
---|
247 | Furthermore it is said that its performance on a Cray is superior |
---|
248 | to all other machines... |
---|
249 | |
---|
250 | And I think it would be much fun to translate the decompressors |
---|
251 | to Z-80 or 6502 assembly. |
---|
252 | |
---|
253 | |
---|
254 | The future |
---|
255 | ---------- |
---|
256 | Here is what I'm planning for the next months. No promises, though... |
---|
257 | |
---|
258 | - interfaces to .NET and Mono |
---|
259 | - interfaces to Perl, Java, Python, Delphi, Visual Basic, ... |
---|
260 | - improve documentation and API reference |
---|
261 | |
---|
262 | |
---|
263 | Some comments about the source code |
---|
264 | ----------------------------------- |
---|
265 | Be warned: the main source code in the 'src' directory is a |
---|
266 | real pain to understand as I've experimented with hundreds of slightly |
---|
267 | different versions. It contains many #if and some gotos, and |
---|
268 | is *completely optimized for speed* and not for readability. |
---|
269 | Code sharing of the different algorithms is implemented by stressing |
---|
270 | the preprocessor - this can be really confusing. Lots of marcos and |
---|
271 | assertions don't make things better. |
---|
272 | |
---|
273 | Nevertheless LZO compiles very quietly on a variety of |
---|
274 | compilers with the highest warning levels turned on, even |
---|
275 | in C++ mode. |
---|
276 | |
---|
277 | |
---|
278 | Copyright |
---|
279 | --------- |
---|
280 | LZO is Copyright (C) 1996-2014 Markus Franz Xaver Oberhumer |
---|
281 | All Rights Reserved. |
---|
282 | |
---|
283 | LZO is distributed under the terms of the GNU General Public License (GPL). |
---|
284 | See the file COPYING. |
---|
285 | |
---|
286 | Special licenses for commercial and other applications which |
---|
287 | are not willing to accept the GNU General Public License |
---|
288 | are available by contacting the author. |
---|
289 | |
---|
290 | |
---|
291 | |
---|