A Samsung printer specific compression routine. More...

Functions
static unsigned	drv_spl2_checksum_calc (size_t sz, const uint8_t src[sz])

uint8_t	drv_spl2_band_rotate (struct spl2_band_buffer *info)

static int	sort_offsets (const void n1, const void n2)

static size_t	offset_dictionary_limit (size_t element_cnt, struct dictionary_collection dc[element_cnt])

static int	sort_appearances (const void n1, const void n2)

static void	offset_dictionary_calc (struct spl2_band_buffer *info, size_t cnt, const uint8_t data[cnt])

static void	compressed_entry_write (uint8_t buffer[2], size_t cnt, size_t offset_idx)

static size_t	pattern_find_longest (struct spl2_band_buffer info, size_t cur_idx, size_t remain_length, size_t offset_idx)

int	drv_spl2_band_compress (struct spl2_band_buffer *info)

Detailed Description

Printer's data is encoded with some kind of dictionary. This dictionary contains the most used pattern found in the first 2 kiB of raw printing data. Since the printer prints in bands of 128 lines, this defines one data set to encode for the printer.

Don't know exactly why, but this algorithm is called 0x11.

Todo:

Count zero bits in all bands of a paper to calculate the print coverage

Check https://en.wikipedia.org/wiki/Burrows%E2%80%93Wheeler_transform for a better algorithm.

Function Documentation

◆ drv_spl2_checksum_calc()

static unsigned drv_spl2_checksum_calc	(	size_t	sz,
		const uint8_t	src[sz]
	)

static

Calculate a simple checksum over the given count of bytes

Parameters

[in]	sz	Byte count in `src`
[in]	src	Buffer to calculate the sum from

Returns: Simple checksum

According to the documentation it is a simple sum over all bytes including overrun in a single 32 bit number.

◆ drv_spl2_band_rotate()

uint8_t drv_spl2_band_rotate ( struct spl2_band_buffer * info )

Rotate the input data into the band buffer for further processing.

Parameters

[in] info All we must know about the raw data and the resulting band

Returns: 0xff if the whole band is empty, any value else

We got our data line by line. The band data must be in collumn by collumn. So, rotate the bytes (not the pixels!) for the band first.

This copies the data from the (line based) input buffer to the (collumn based) band buffer.

Precondition: band size must be large enough to hold all data from the input buffer e.g. at least 128 * line size

◆ sort_offsets()

static int sort_offsets	(	const void *	n1,
		const void *	n2
	)

static

Compare function#2 for the qsort() call

Parameters

[in]	n1	Entry to compare with `n2`
[in]	n2	Entry to compare with `n1`

Return values

negative	if n2 < n1
zero	if n1 == n2
positive	if n2 > n1

Return values are selected to let qsort() move entries with lower offset to the beginning of the dictionary.

◆ offset_dictionary_limit()

static size_t offset_dictionary_limit	(	size_t	element_cnt,
		struct dictionary_collection	dc[element_cnt]
	)

static

Limit the entry count in the dictionary

Parameters

[in]	element_cnt	The element count in the dictionary
[in,out]	dc	The dictionary

Returns: Valid element count in this dictionay

The element count of the compressed header's dictionary is limited to 64 entries. This routine limits the element count to these 64 entries or less than that, if dc has less than 64 valid entries. An entry is valid if its dictionary_collection::appearance member isn't 0. Since the dictionary is already sorted, the first member with dictionary_collection::appearance member equal to 0 defines the max. used entries.

While at it, all valid dictionary_collection::offset members get incremented by one to define the real negative offset.

In order to detect and compress repeating bytes, the dictionary_collection::offset '1' needs to be in the dictionary. This function ensures this offset is always present and moves it to the beginning of the dictionary to improve its detection.

Precondition: The dictionary must be already sorted, highest offset appearance at index '0'.

◆ sort_appearances()

static int sort_appearances	(	const void *	n1,
		const void *	n2
	)

static

Compare function for the qsort() call

Parameters

[in]	n1	Entry to compare with n2
[in]	n2	Entry to compare with n1

Return values

negative	if n2 < n1
zero	if n1 == n2
positive	if n2 > n1

Return values are selected to let qsort() move entries with larger appearance to the beginning of the dictionary.

◆ offset_dictionary_calc()

static void offset_dictionary_calc	(	struct spl2_band_buffer *	info,
		size_t	cnt,
		const uint8_t	data[cnt]
	)

static

Calculate some kind of dictionary for one band of printer data

Parameters

[in,out]	info	Information about the current band
[in]	cnt	Byte count in data
[in]	data	The data buffer used for the calculation

This routine searches in the first COMPRESS_SAMPLE_SIZE bytes of data for repeating characters. It starts with the character at offset (COMPRESS_SAMPLE_SIZE - 1) and searches for matches of this character backwards. Then it starts again at (COMPRESS_SAMPLE_SIZE - 2), until all characters are searched in such way. In the dictionary for each match it increments the appearance counter of the corresponding offset where the match happens. The dictionary then gets sorted, the offset with the most appearances first. At the end it gets limited to the first 64 entries (or less, if less valid entries are available).

Note: Measurements show, it doesn't improve the compression, if more than COMPRESS_SAMPLE_SIZE bytes are probed. With this dictionary it was possible to reduce a band data of size 79376 (DIN A4 at 600 DPI) down to 915 bytes.

Experiences show: it makes no sense to check every byte value again and again. So one improvement is to check every occuring byte value only once.

Precondition: cnt must be larger than COMPRESS_SAMPLE_SIZE

◆ compressed_entry_write()

static void compressed_entry_write	(	uint8_t	buffer[2],
		size_t	cnt,
		size_t	offset_idx
	)

static

Append a two byte repeating entry into the output buffer

Parameters

[in]	buffer	Where to store the entry
[in]	cnt	Duplicate byte count
[in]	offset_idx	The index into the header's offset table dictionary

A repeating entry just defines the amount of bytes to duplicate/repeat and an offset to copy this byte/these bytes from. The offset comes from the header's offset table. The offset defines the backward position into the area with already reconstructed bytes to copy the duplicated/repeated bytes from.

Encoding:

r(epeat counter): r8…r0, e.g. values of 0…511
i(ndex): i5…i0, e.g. values of 0…63

Bit 7  6  5  4  3  2  1  0   7  6  5  4  3  2  1  0
    1 r6 r5 r4 r3 r2 r1 r0  r8 r7 i5 i4 i3 i2 i1 i0

i(ndex) points into one entry in the header's dictionary_collection

Precondition: cnt must be in range 3...514; offset_idx must in range 0...63; buffer must be at least of size 2 bytes

◆ pattern_find_longest()

static size_t pattern_find_longest	(	struct spl2_band_buffer *	info,
		size_t	cur_idx,
		size_t	remain_length,
		size_t *	offset_idx
	)

static

Find the longest matching pattern in the band data based on the offsets in the dictionary

Parameters

[in,out]	info	Information about the current band
[in]	cur_idx	Current data position to encode in info->band
[in]	remain_length	Remaining bytes after `cur_idx` to compare
[out]	offset_idx	Dictionary index of the best (e.g. longest) matching pattern

Return values

Below	3, no useable pattern found
Above	3, matching and useable pattern found (`offset_idx` is then valid)

Precondition: remain_length must be greater than MIN_COMPRESSED_BYTES to make sense to call this function; spl2_band_buffer::dictionary_size must be greater than 0 to make sense to call this function

◆ drv_spl2_band_compress()

int drv_spl2_band_compress ( struct spl2_band_buffer * info )

Compress/encode one band of printer data

Parameters

[in,out] info Information about the current band

Returns: 0 on success (currently always)

Functions

Detailed Description

Function Documentation

◆ drv_spl2_checksum_calc()

◆ drv_spl2_band_rotate()

◆ sort_offsets()

◆ offset_dictionary_limit()

◆ sort_appearances()

◆ offset_dictionary_calc()

◆ compressed_entry_write()

◆ pattern_find_longest()

◆ drv_spl2_band_compress()