|
static unsigned | drv_spl2_checksum_calc (size_t sz, const uint8_t src[sz]) |
|
uint8_t | drv_spl2_band_rotate (struct spl2_band_buffer *info) |
|
static int | sort_offsets (const void *n1, const void *n2) |
|
static size_t | offset_dictionary_limit (size_t element_cnt, struct dictionary_collection dc[element_cnt]) |
|
static int | sort_appearances (const void *n1, const void *n2) |
|
static void | offset_dictionary_calc (struct spl2_band_buffer *info, size_t cnt, const uint8_t data[cnt]) |
|
static void | compressed_entry_write (uint8_t buffer[2], size_t cnt, size_t offset_idx) |
|
static size_t | pattern_find_longest (struct spl2_band_buffer *info, size_t cur_idx, size_t remain_length, size_t *offset_idx) |
|
int | drv_spl2_band_compress (struct spl2_band_buffer *info) |
|
Printer's data is encoded with some kind of dictionary. This dictionary contains the most used pattern found in the first 2 kiB of raw printing data. Since the printer prints in bands of 128 lines, this defines one data set to encode for the printer.
Don't know exactly why, but this algorithm is called 0x11.
- Todo:
Count zero bits in all bands of a paper to calculate the print coverage
Check https://en.wikipedia.org/wiki/Burrows%E2%80%93Wheeler_transform for a better algorithm.
static size_t offset_dictionary_limit |
( |
size_t |
element_cnt, |
|
|
struct dictionary_collection |
dc[element_cnt] |
|
) |
| |
|
static |
Limit the entry count in the dictionary
- Parameters
-
[in] | element_cnt | The element count in the dictionary |
[in,out] | dc | The dictionary |
- Returns
- Valid element count in this dictionay
The element count of the compressed header's dictionary is limited to 64 entries. This routine limits the element count to these 64 entries or less than that, if dc
has less than 64 valid entries. An entry is valid if its dictionary_collection::appearance member isn't 0. Since the dictionary is already sorted, the first member with dictionary_collection::appearance member equal to 0 defines the max. used entries.
While at it, all valid dictionary_collection::offset members get incremented by one to define the real negative offset.
In order to detect and compress repeating bytes, the dictionary_collection::offset '1' needs to be in the dictionary. This function ensures this offset is always present and moves it to the beginning of the dictionary to improve its detection.
- Precondition
- The dictionary must be already sorted, highest offset appearance at index '0'.
static void offset_dictionary_calc |
( |
struct spl2_band_buffer * |
info, |
|
|
size_t |
cnt, |
|
|
const uint8_t |
data[cnt] |
|
) |
| |
|
static |
Calculate some kind of dictionary for one band of printer data
- Parameters
-
[in,out] | info | Information about the current band |
[in] | cnt | Byte count in data |
[in] | data | The data buffer used for the calculation |
This routine searches in the first COMPRESS_SAMPLE_SIZE bytes of data
for repeating characters. It starts with the character at offset (COMPRESS_SAMPLE_SIZE - 1) and searches for matches of this character backwards. Then it starts again at (COMPRESS_SAMPLE_SIZE - 2), until all characters are searched in such way. In the dictionary for each match it increments the appearance counter of the corresponding offset where the match happens. The dictionary then gets sorted, the offset with the most appearances first. At the end it gets limited to the first 64 entries (or less, if less valid entries are available).
- Note
- Measurements show, it doesn't improve the compression, if more than COMPRESS_SAMPLE_SIZE bytes are probed. With this dictionary it was possible to reduce a band data of size 79376 (DIN A4 at 600 DPI) down to 915 bytes.
Experiences show: it makes no sense to check every byte value again and again. So one improvement is to check every occuring byte value only once.
- Precondition
cnt
must be larger than COMPRESS_SAMPLE_SIZE
static void compressed_entry_write |
( |
uint8_t |
buffer[2], |
|
|
size_t |
cnt, |
|
|
size_t |
offset_idx |
|
) |
| |
|
static |
Append a two byte repeating entry into the output buffer
- Parameters
-
[in] | buffer | Where to store the entry |
[in] | cnt | Duplicate byte count |
[in] | offset_idx | The index into the header's offset table dictionary |
A repeating entry just defines the amount of bytes to duplicate/repeat and an offset to copy this byte/these bytes from. The offset comes from the header's offset table. The offset defines the backward position into the area with already reconstructed bytes to copy the duplicated/repeated bytes from.
Encoding:
- r(epeat counter): r8…r0, e.g. values of 0…511
- i(ndex): i5…i0, e.g. values of 0…63
Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
1 r6 r5 r4 r3 r2 r1 r0 r8 r7 i5 i4 i3 i2 i1 i0
i(ndex) points into one entry in the header's dictionary_collection
- Precondition
cnt
must be in range 3...514
-
offset_idx
must in range 0...63
-
buffer
must be at least of size 2 bytes