.basis File Format and ETC1S Texture Video Specification
Version 1.03
4.1 "basis_file_header" structure
4.2 "basis_slice_desc" Structure
9.0 ETC1S Compressed Slice Decoding Huffman Tables
10.1 ETC1S Approximate Move to Front Routines
10.2 ETC1S VLC Decoding Procedure
10.3 ETC1S Slice Block Decoding
11.0 Alpha Channels in ETC1S Format Files
The .basis file format specification and ETC1S Texture/Texture Video Specification is explicitly not copyrighted by any entity, and to our knowledge is patent free. It may be used for any purpose whatsoever, including commercial purposes. The author of this work hereby waives all claim of copyright (economic and moral) in this work and immediately places it in the public domain; it may be used, distorted or destroyed in any manner whatsoever without further attribution or notice to the creator.
The Basis Universal GPU texture codec supports reading and writing ".basis" files. The .basis file format currently supports ETC1S or UASTC 4x4 texture data.
- ETC1S is a simplified subset of the Khronos ETC1 GPU texture format, which is very popular on Android.
In ETC1S, the mode is always differential (diff bit=1), the Rd, Gd, and Bd color deltas are always (0,0,0), and the flip bit is always set. ETC1S blocks are specified using the 15-bit 555 base color (called "color endpoints" in this specification, which is terminology derived from our BC1 texture format systems), the 3-bit intensity table index, and the 4x4 texel array of 2-bit selector indices (which are called "pixel index bits" in the Khronos ETC1 specification). ETC1S texture data is fully compliant with all existing software and hardware ETC1 decoders. Existing encoders can be easily modified to limit their output to ETC1S.
ETC1S format .basis files have built-in lossy data compression applied to the ETC1S block data, which is based on Vector Quantization (VQ). VQ is applied to the ETC1S color endpoint/intensity values and to the texel selectors, which are treated as two separate vectors. The two codebooks are global and shared across all mipmap levels, cubemap faces, video frames, etc. There are two VQ codebook indices per block, which are compressed using a combination of canonical Huffman coding, Run-Length Encoding (RLE), DPCM coding, and an approximate Move to Front (MTF) transform.
- UASTC 4x4 is a 19 mode subset of the ASTC texture format. Its specification is here. UASTC texture data can always be losslessly transcoded to ASTC, or transcoded to BC7 with low loss (approximately .75-1.5 dB). UASTC textures are substantially higher quality than ETC1S textures, but are also significantly larger in memory (8-bpp vs. 4-bpp).
UASTC files may be encoded for highest quality, or may be optionally encoded using Rate Distortion Optimization (RDO) of the endpoint/intensity/selector bits. UASTC files encoded with an RDO encoder may then be losslessly compressed using any lossless data compression codec, such as Deflate, LZMA, Zstd, etc. This specification does not specify how UASTC .basis files should be further losslessly compressed (that is up to the end user).
.basis files containing UASTC texture data do not use global codebooks.
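To make the ETC1S restrictions listed above concrete (differential mode, zero color deltas, flip bit set), here is a minimal sketch that packs the first four bytes of an 8-byte ETC1 block. The function name is hypothetical. The bit positions follow the Khronos ETC1 differential-mode layout, and the sketch assumes both subblocks share the single 3-bit intensity table index. The 2-bit selector bytes are not covered.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: packs bytes 0..3 of an ETC1 block under the ETC1S
// restrictions. r5/g5/b5 are the 5-bit base color, inten3 is the 3-bit
// intensity table index.
std::array<uint8_t, 4> pack_etc1s_block_header(uint8_t r5, uint8_t g5, uint8_t b5, uint8_t inten3)
{
    assert(r5 < 32 && g5 < 32 && b5 < 32 && inten3 < 8);

    std::array<uint8_t, 4> bytes{};

    // Each color byte holds the 5-bit base color in bits 7..3 and a 3-bit
    // delta in bits 2..0. In ETC1S the delta is always 0.
    bytes[0] = static_cast<uint8_t>(r5 << 3);
    bytes[1] = static_cast<uint8_t>(g5 << 3);
    bytes[2] = static_cast<uint8_t>(b5 << 3);

    // Byte 3: table codeword 1 (bits 7..5), table codeword 2 (bits 4..2),
    // diff bit (bit 1), flip bit (bit 0). Both codewords are assumed to be
    // the same intensity index, and both flag bits are set.
    bytes[3] = static_cast<uint8_t>((inten3 << 5) | (inten3 << 2) | 0x2 | 0x1);

    return bytes;
}
```

For example, base color (31, 0, 16) with intensity table 5 packs to the bytes 0xF8, 0x00, 0x80, 0xB7.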
A .basis file consists of multiple sections. Apart from the header, which must always be at the start of the file, the other sections may appear in any order.
Here's the high level organization of a typical .basis file:
- The file header
- Optional ETC1S compressed endpoint/selector codebooks
- Optional ETC1S Huffman table information
- A required "slice" description array describing the resolutions and file offset/compressed sizes of each texture slice present in the file
- 1 or more slices containing ETC1S or UASTC compressed texture data.
- For future expansion, the format supports an "extended" header which may be located anywhere in the file. This section contains .PNG-like chunked data.
.basis files can contain 2D images/textures, 2D texture arrays, cubemap arrays, and video frames. All image/texture types support optional mipmaps and alpha channels. Volume textures are supported in the format, but the reference encoder doesn't support them yet.
For texture video, all frames are either I-frames (which can be completely decoded without referencing any previous frames), or P-frames (which reference the previous frame's decoded ETC1S contents). The current reference encoder always outputs an I-frame for the first frame, and all subsequent frames are P-frames.
// basis_file_header::m_tex_type
enum basis_texture_type
{
cBASISTexType2D = 0,
cBASISTexType2DArray = 1,
cBASISTexTypeCubemapArray = 2,
cBASISTexTypeVideoFrames = 3,
cBASISTexTypeVolume = 4,
cBASISTexTypeTotal
};
// basis_slice_desc::flags
enum basis_slice_desc_flags
{
cSliceDescFlagsHasAlpha = 1,
cSliceDescFlagsFrameIsIFrame = 2
};
// basis_file_header::m_tex_format
enum basis_tex_format
{
cETC1S = 0,
cUASTC4x4 = 1
};
// basis_file_header::m_flags
enum basis_header_flags
{
cBASISHeaderFlagETC1S = 1,
cBASISHeaderFlagYFlipped = 2,
cBASISHeaderFlagHasAlphaSlices = 4
};
All individual members in all file structures are byte aligned and little endian. The structs have no padding (i.e. they are declared with #pragma pack(1)).
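Because the members are byte aligned and little endian, a portable reader can assemble each field one byte at a time instead of casting the file buffer to a struct. The sketch below is one possible way to do that; read_le is a hypothetical helper name, and it handles the 1, 2, 3 (uint24), and 4 byte field sizes used by the structures in this specification.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reads an unsigned little-endian integer that is num_bytes (1 to 4) wide.
// The least significant byte comes first in the file.
uint32_t read_le(const uint8_t* p, size_t num_bytes)
{
    assert(num_bytes >= 1 && num_bytes <= 4);

    uint32_t v = 0;
    for (size_t i = 0; i < num_bytes; i++)
        v |= static_cast<uint32_t>(p[i]) << (8 * i);
    return v;
}
```

With this byte order, a uint16 holding 0x4273 is stored as the byte 0x73 followed by the byte 0x42.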
The file header must always be at the beginning of the file.
struct basis_file_header
{
uint16 m_sig; // 2 byte file signature
uint16 m_ver; // File version
uint16 m_header_size; // Header size in bytes, sizeof(basis_file_header) or 0x4D
uint16 m_header_crc16; // CRC16/genibus of the remaining header data
uint32 m_data_size; // The total size of all data after the header
uint16 m_data_crc16; // The CRC16 of all data after the header
uint24 m_total_slices; // The number of compressed slices
uint24 m_total_images; // The total # of images
byte m_tex_format; // enum basis_tex_format
uint16 m_flags; // enum basis_header_flags
byte m_tex_type; // enum basis_texture_type
uint24 m_us_per_frame; // Video: microseconds per frame
uint32 m_reserved; // For future use
uint32 m_userdata0; // For client use
uint32 m_userdata1; // For client use
uint16 m_total_endpoints; // ETC1S: The number of endpoints in the endpoint codebook
uint32 m_endpoint_cb_file_ofs; // ETC1S: The compressed endpoint codebook's file offset relative to the header
uint24 m_endpoint_cb_file_size; // ETC1S: The compressed endpoint codebook's size in bytes
uint16 m_total_selectors; // ETC1S: The number of selectors in the selector codebook
uint32 m_selector_cb_file_ofs; // ETC1S: The compressed selector codebook's file offset relative to the header
uint24 m_selector_cb_file_size; // ETC1S: The compressed selector codebook's size in bytes
uint32 m_tables_file_ofs; // ETC1S: The file offset of the compressed Huffman codelength tables.
uint32 m_tables_file_size; // ETC1S: The file size in bytes of the compressed Huffman codelength tables.
uint32 m_slice_desc_file_ofs; // The file offset to the slice description array, usually follows the header
uint32 m_extended_file_ofs; // The file offset of the "extended" header and compressed data, for future use
uint32 m_extended_file_size; // The file size in bytes of the "extended" header and compressed data, for future use
};
4.1.1 Details:
- m_sig is always 'B' * 256 + 's', or 0x4273.
- m_ver is currently always 0x10.
- m_header_size is sizeof(basis_file_header). It's always 0x4D.
- m_header_crc16 is the CRC-16 of the remaining header data. See the "CRC-16" section 5.0 below for more information.
- m_data_size, m_data_crc16: The size of all data following the header, and its CRC-16.
- m_total_slices: The total number of slices, from [1,2^24-1]
- m_total_images: The total number of images (where one image can contain multiple mipmap levels, and each mipmap level is a different slice).
- m_tex_format: basis_tex_format. Either cETC1S (0), or cUASTC4x4 (1).
- m_flags: A combination of flags from the basis_header_flags enum.
- m_tex_type: The texture type, from enum basis_texture_type
- m_us_per_frame: Microseconds per frame, only valid for cBASISTexTypeVideoFrames texture types.
- m_total_endpoints, m_endpoint_cb_file_ofs, m_endpoint_cb_file_size: Information about the compressed ETC1S endpoint codebook: The total # of entries, the offset to the compressed data, and the compressed data's size.
- m_total_selectors, m_selector_cb_file_ofs, m_selector_cb_file_size: Information about the compressed ETC1S selector codebook: The total # of entries, the offset to the compressed data, and the compressed data's size.
- m_tables_file_ofs, m_tables_file_size: The file offset and size of the compressed Huffman tables for ETC1S format files.
- m_slice_desc_file_ofs: The file offset to the array of slice description structures. There will be m_total_slices structures at this file offset.
- m_extended_file_ofs, m_extended_file_size: The "extended" header, for future expansion. Currently unused.
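As a rough illustration of the header fields described above, the following sketch performs a quick sanity check on a file buffer and pulls out a few fields. The byte offsets are not stated in this specification. They are derived here by summing the field sizes of the packed basis_file_header structure, so treat them as an assumption to verify. The struct and function names are hypothetical, and CRC validation is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Little-endian read of 1 to 4 bytes (hypothetical helper).
static uint32_t read_le(const uint8_t* p, size_t num_bytes)
{
    uint32_t v = 0;
    for (size_t i = 0; i < num_bytes; i++)
        v |= static_cast<uint32_t>(p[i]) << (8 * i);
    return v;
}

// A few header fields of interest (hypothetical summary type).
struct basis_header_summary
{
    bool valid;
    uint32_t total_slices;
    uint32_t tex_format;
    uint32_t tex_type;
    uint32_t slice_desc_file_ofs;
};

const size_t BASIS_HEADER_SIZE = 0x4D; // sizeof(basis_file_header)

basis_header_summary summarize_header(const std::vector<uint8_t>& file)
{
    basis_header_summary s{};
    if (file.size() < BASIS_HEADER_SIZE)
        return s;

    const uint8_t* p = file.data();
    if (read_le(p + 0, 2) != 0x4273)             // m_sig
        return s;
    if (read_le(p + 4, 2) != BASIS_HEADER_SIZE)  // m_header_size
        return s;

    s.total_slices = read_le(p + 14, 3);         // m_total_slices (uint24)
    s.tex_format = read_le(p + 20, 1);           // m_tex_format
    s.tex_type = read_le(p + 23, 1);             // m_tex_type
    s.slice_desc_file_ofs = read_le(p + 65, 4);  // m_slice_desc_file_ofs
    s.valid = (s.total_slices >= 1);
    return s;
}
```

A real transcoder would also check m_ver, both CRC-16 fields, and that every offset and size lies inside the file.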
struct basis_slice_desc
{
uint24 m_image_index;
uint8 m_level_index;
uint8 m_flags;
uint16 m_orig_width;
uint16 m_orig_height;
uint16 m_num_blocks_x;
uint16 m_num_blocks_y;
uint32 m_file_ofs;
uint32 m_file_size;
uint16 m_slice_data_crc16;
};
4.2.1 Details:
- m_image_index: The index of the source image provided to the encoder (will always appear in order from first to last, first image index is 0, no skipping allowed)
- m_level_index: The mipmap level index (mipmaps will always appear from largest to smallest)
- m_flags: Zero or more enum basis_slice_desc_flags logically OR'd together
- m_orig_width: The original image width (may not be a multiple of 4 pixels)
- m_orig_height: The original image height (may not be a multiple of 4 pixels)
- m_num_blocks_x: The slice's block X dimensions. Each block is 4x4 pixels. The slice's pixel resolution may or may not be a power of 2.
- m_num_blocks_y: The slice's block Y dimensions.
- m_file_ofs: Offset from the header to the start of the slice's data
- m_file_size: The size of the compressed slice data in bytes
- m_slice_data_crc16: The CRC-16 of the slice data, for extra-paranoid use cases. For ETC1S, this is the CRC-16 of the uncompressed texture data. For UASTC, this is the CRC-16 of the UASTC texture data. It's optional for transcoders/decoders to validate this CRC-16.
.basis files use CRC-16/genibus (aka CRC-16 EPC, CRC-16 I-CODE, CRC-16 DARC) format CRC-16's.
Here's an example function in C++:
uint16_t crc16(const void* r, size_t size, uint16_t crc)
{
crc = ~crc;
const uint8_t* p = static_cast<const uint8_t*>(r);
for ( ; size; --size)
{
const uint16_t q = *p++ ^ (crc >> 8);
uint16_t k = (q >> 4) ^ q;
crc = (((crc << 8) ^ k) ^ (k << 5)) ^ (k << 12);
}
return static_cast<uint16_t>(~crc);
}
This function is called with 0 in the final "crc" parameter when computing CRC-16's of file data.
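A new implementation of this CRC can be checked against the published check value for CRC-16/GENIBUS, which is 0xD64E for the nine ASCII bytes "123456789". The sketch below repeats the function from this section so it compiles on its own, and adds a small comparison helper whose name (crc16_matches) is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Same routine as the example in this section.
uint16_t crc16(const void* r, size_t size, uint16_t crc)
{
    crc = static_cast<uint16_t>(~crc);
    const uint8_t* p = static_cast<const uint8_t*>(r);
    for ( ; size; --size)
    {
        const uint16_t q = static_cast<uint16_t>(*p++ ^ (crc >> 8));
        const uint16_t k = static_cast<uint16_t>((q >> 4) ^ q);
        crc = static_cast<uint16_t>((((crc << 8) ^ k) ^ (k << 5)) ^ (k << 12));
    }
    return static_cast<uint16_t>(~crc);
}

// Hypothetical helper: true if the data's CRC-16 equals the stored value.
bool crc16_matches(const void* data, size_t size, uint16_t stored_crc)
{
    return crc16(data, size, 0) == stored_crc;
}
```

Calling crc16("123456789", 9, 0) should return 0xD64E.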
ETC1S format .basis files rely heavily on static canonical Huffman prefix coding. Many compressed sections use multiple Huffman tables. Huffman codes are stored in each output byte in LSB to MSB order. (This is opposite of the JPEG format, which stores the codes in MSB to LSB order.)
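The later code examples in this specification call a get_bits() routine without defining it. Here is a minimal sketch of an LSB-first bit reader for raw bit fields, assuming (as in Deflate) that the first bit read becomes the least significant bit of the returned value. The class name is hypothetical, and reading past the end of the buffer simply returns zero bits in this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

class lsb_bit_reader
{
public:
    explicit lsb_bit_reader(const std::vector<uint8_t>& bytes) :
        m_bytes(bytes), m_bit_pos(0)
    {
    }

    // Returns the next num_bits (0 to 32) bits. Bits are consumed from the
    // LSB of each byte upwards, and the first bit read lands in bit 0.
    uint32_t get_bits(uint32_t num_bits)
    {
        assert(num_bits <= 32);

        uint32_t v = 0;
        for (uint32_t i = 0; i < num_bits; i++)
        {
            const size_t byte_index = m_bit_pos >> 3;
            uint32_t bit = 0;
            if (byte_index < m_bytes.size())
                bit = (m_bytes[byte_index] >> (m_bit_pos & 7)) & 1U;
            v |= bit << i;
            m_bit_pos++;
        }
        return v;
    }

private:
    std::vector<uint8_t> m_bytes;
    size_t m_bit_pos;
};
```

For the bytes 0xB5, 0x01, reading 3, 5, and then 2 bits yields 5, 22, and 1.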
Huffman coding in .basis is compatible with the canonical Huffman methods used by Deflate encoders/decoders. See section 3.2.2 of the Deflate specification (RFC 1951), which describes how to compute the value of each Huffman code given an array of symbol codelengths. This document assumes familiarity with how Huffman coding works in Deflate.
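For readers who want a refresher, the following sketch assigns canonical codes from an array of code lengths using the three steps given in RFC 1951 section 3.2.2. The function name is hypothetical, and the sketch only computes code values. It does not build a decoding table or deal with bit ordering in the stream.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assigns canonical Huffman codes from per-symbol code lengths.
// Symbols with a length of 0 are unused and keep a code of 0.
std::vector<uint32_t> assign_canonical_codes(const std::vector<uint8_t>& lengths)
{
    const uint32_t MAX_BITS = 16; // cHuffmanMaxSupportedCodeSize

    // Step 1: count how many symbols use each code length.
    std::vector<uint32_t> bl_count(MAX_BITS + 1, 0);
    for (uint8_t len : lengths)
    {
        assert(len <= MAX_BITS);
        bl_count[len]++;
    }
    bl_count[0] = 0;

    // Step 2: find the numerically smallest code for each code length.
    std::vector<uint32_t> next_code(MAX_BITS + 1, 0);
    uint32_t code = 0;
    for (uint32_t bits = 1; bits <= MAX_BITS; bits++)
    {
        code = (code + bl_count[bits - 1]) << 1;
        next_code[bits] = code;
    }

    // Step 3: hand out consecutive codes to symbols of the same length.
    std::vector<uint32_t> codes(lengths.size(), 0);
    for (size_t i = 0; i < lengths.size(); i++)
    {
        if (lengths[i] != 0)
            codes[i] = next_code[lengths[i]]++;
    }
    return codes;
}
```

Using the example from the RFC, the lengths (3, 3, 3, 3, 3, 2, 4, 4) produce the codes 010, 011, 100, 101, 110, 00, 1110, 1111.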
First, some enums:
enum
{
// Max supported Huffman code size is 16-bits
cHuffmanMaxSupportedCodeSize = 16,
// The maximum number of symbols is 2^14
cHuffmanMaxSymsLog2 = 14,
cHuffmanMaxSyms = 1 << cHuffmanMaxSymsLog2,
// Small zero runs may range from 3-10 entries
cHuffmanSmallZeroRunSizeMin = 3,
cHuffmanSmallZeroRunSizeMax = 10,
cHuffmanSmallZeroRunExtraBits = 3,
// Big zero runs may range from 11-138 entries
cHuffmanBigZeroRunSizeMin = 11,
cHuffmanBigZeroRunSizeMax = 138,
cHuffmanBigZeroRunExtraBits = 7,
// Small non-zero runs may range from 3-6 entries
cHuffmanSmallRepeatSizeMin = 3,
cHuffmanSmallRepeatSizeMax = 6,
cHuffmanSmallRepeatExtraBits = 2,
// Big non-zero run may range from 7-134 entries
cHuffmanBigRepeatSizeMin = 7,
cHuffmanBigRepeatSizeMax = 134,
cHuffmanBigRepeatExtraBits = 7,
// There are a maximum of 21 symbols in a compressed Huffman code length table.
cHuffmanTotalCodelengthCodes = 21,
// Symbols [0,16] indicate code sizes. Other symbols indicate zero runs or repeats:
cHuffmanSmallZeroRunCode = 17,
cHuffmanBigZeroRunCode = 18,
cHuffmanSmallRepeatCode = 19,
cHuffmanBigRepeatCode = 20
};
A .basis Huffman table consists of 1 to cHuffmanMaxSyms symbols. Each compressed Huffman table is described by an array of symbol code lengths in bits.
The table's symbol code lengths are themselves RLE+Huffman coded, just like Deflate. (Note this can be confusing to developers unfamiliar with Deflate.) Each table begins with a small fixed header:
14 bits: total_used_syms [1, cHuffmanMaxSyms]
5 bits: num_codelength_codes [1, cHuffmanTotalCodelengthCodes]
Next, the code lengths for the small Huffman table which is used to send the compressed codelengths (and RLE/repeat codes) are sent uncompressed but in a reordered manner:
3*num_codelength_codes bits: Code size of each Huffman symbol for the compressed Huffman
codelength table.
These code lengths are sent in this order (to help reduce the number that must be sent):
{
cHuffmanSmallZeroRunCode, cHuffmanBigZeroRunCode,
cHuffmanSmallRepeatCode, cHuffmanBigRepeatCode,
0, 8, 7, 9, 6, 0xA, 5, 0xB, 4, 0xC, 3, 0xD, 2, 0xE, 1, 0xF, 0x10
};
A canonical Huffman decoding table (of up to 21 symbols) should be built from these code lengths. Immediately following this data are the Huffman symbols (sometimes intermixed with raw bits) which describe how to unpack the codelengths of each symbol in the Huffman table:
- Symbols [0,16] indicate a specific symbol code length in bits.
- Symbol cHuffmanSmallZeroRunCode (17) indicates a short run of symbols with 0 bit code lengths.
cHuffmanSmallZeroRunExtraBits (3) bits are sent after this symbol, which indicates the run's
size after adding the minimum size (cHuffmanSmallZeroRunSizeMin).
- Symbol cHuffmanBigZeroRunCode (18) indicates a long run of symbols with 0 bit code lengths.
cHuffmanBigZeroRunExtraBits (7) bits are sent after this symbol, which indicates the run's
size after adding the minimum size (cHuffmanBigZeroRunSizeMin)
- Symbol cHuffmanSmallRepeatCode (19) indicates a short run of symbols that repeat the previous
symbol's code length. cHuffmanSmallRepeatExtraBits (2) bits are sent after this symbol, which
indicates the number of times to repeat the previous symbol's code length, after adding the
minimum size (cHuffmanSmallRepeatSizeMin). Cannot be the first symbol, and the previous symbol
cannot have a code length of 0.
- Symbol cHuffmanBigRepeatCode (20) indicates a long run of symbols that repeat the previous
symbol's code length. cHuffmanBigRepeatExtraBits (7) bits are sent after this symbol, which
indicates the number of times to repeat the previous symbol's code length, after adding the
minimum size (cHuffmanBigRepeatSizeMin). Cannot be the first symbol, and the previous symbol
cannot have a code length of 0.
There should be exactly total_used_syms code lengths stored in the compressed Huffman table. If not, the stream is either corrupted or invalid.
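The run and repeat rules listed above can be sketched as follows. To keep the example short it works on a list of already-decoded tokens, where each token pairs a codelength symbol with the raw value of the extra bits that followed it. That token type and the function name are hypothetical, and the extra bits are assumed to be already limited to their stated widths.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One decoded codelength symbol plus its extra bits (ignored for 0..16).
struct codelength_token
{
    uint32_t sym;
    uint32_t extra;
};

// Expands tokens into per-symbol code lengths. Returns false if the
// stream breaks one of the rules or the final count is wrong.
bool expand_codelengths(const std::vector<codelength_token>& tokens,
                        uint32_t total_used_syms,
                        std::vector<uint8_t>& lengths)
{
    lengths.clear();
    for (const codelength_token& t : tokens)
    {
        size_t run = 1;
        uint8_t len = 0;

        if (t.sym <= 16)
            len = static_cast<uint8_t>(t.sym);        // literal code length
        else if (t.sym == 17)
            run = t.extra + 3;                        // small zero run: 3..10
        else if (t.sym == 18)
            run = t.extra + 11;                       // big zero run: 11..138
        else if (t.sym == 19 || t.sym == 20)
        {
            // Repeats need a previous symbol with a non-zero code length.
            if (lengths.empty() || lengths.back() == 0)
                return false;
            len = lengths.back();
            run = t.extra + ((t.sym == 19) ? 3 : 7);  // 3..6 or 7..134
        }
        else
            return false;

        if (lengths.size() + run > total_used_syms)
            return false;
        lengths.insert(lengths.end(), run, len);
    }
    return lengths.size() == total_used_syms;
}
```

For instance, the tokens (5), (19, extra 1), (17, extra 0), (2) expand to the nine code lengths 5, 5, 5, 5, 5, 0, 0, 0, 2.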
After all the symbol codelengths are uncompressed, the symbol codes can be computed and the canonical Huffman decoding tables can be built.
Note: It's possible, just like in Deflate, for a Huffman table to be incomplete, containing only a single symbol with a code length of 1 bit. In this case, this symbol will be assigned a code of "0".
The endpoint codebook section starts at file offset basis_file_header::m_endpoint_cb_file_ofs and is m_endpoint_cb_file_size bytes long. The endpoint codebook will have basis_file_header::m_total_endpoints total entries.
At the beginning of the compressed endpoint codebook section are four compressed Huffman tables, stored using the procedure outlined in section 6.0. The Huffman tables appear in this order:
1. color5_delta_model0
2. color5_delta_model1
3. color5_delta_model2
4. inten_delta_model
Following the data for these Huffman tables is a single 1-bit code which indicates if the color endpoint codebook is grayscale or not.
Immediately following this code is the compressed color endpoint codebook data. A simple form of DPCM (Delta Pulse Code Modulation) coding is used to send the ETC1S intensity table indices and color values. Here is the procedure to decode the endpoint codebook:
const int COLOR5_PAL0_PREV_HI = 9, COLOR5_PAL0_DELTA_LO = -9, COLOR5_PAL0_DELTA_HI = 31;
const int COLOR5_PAL1_PREV_HI = 21, COLOR5_PAL1_DELTA_LO = -21, COLOR5_PAL1_DELTA_HI = 21;
const int COLOR5_PAL2_PREV_HI = 31, COLOR5_PAL2_DELTA_LO = -31, COLOR5_PAL2_DELTA_HI = 9;
// Assume previous endpoint color is (16, 16, 16), and the previous intensity is 0.
color32 prev_color5(16, 16, 16, 0);
uint32_t prev_inten = 0;
// For each endpoint codebook entry
for (uint32_t i = 0; i < num_endpoints; i++)
{
// Decode the intensity delta Huffman code
uint32_t inten_delta = decode_huffman(inten_delta_model);
endpoints[i].m_inten5 = static_cast<uint8_t>((inten_delta + prev_inten) & 7);
prev_inten = endpoints[i].m_inten5;
// Now decode the endpoint entry's color or intensity value
for (uint32_t c = 0; c < (endpoints_are_grayscale ? 1U : 3U); c++)
{
// The Huffman table used to decode the delta depends on the previous color's value
int delta;
if (prev_color5[c] <= COLOR5_PAL0_PREV_HI)
delta = decode_huffman(color5_delta_model0);
else if (prev_color5[c] <= COLOR5_PAL1_PREV_HI)
delta = decode_huffman(color5_delta_model1);
else
delta = decode_huffman(color5_delta_model2);
// Apply the delta
int v = (prev_color5[c] + delta) & 31;
endpoints[i].m_color5[c] = static_cast<uint8_t>(v);
prev_color5[c] = static_cast<uint8_t>(v);
}
// If the endpoints are grayscale, set G and B to match R.
if (endpoints_are_grayscale)
{
endpoints[i].m_color5[1] = endpoints[i].m_color5[0];
endpoints[i].m_color5[2] = endpoints[i].m_color5[0];
}
}
The rest of the section's data (if any) can be ignored.
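Two details of the endpoint decoding procedure are easy to get wrong: which of the three color delta tables is used, and how the delta wraps. The sketch below isolates both as small functions with hypothetical names. It assumes nothing beyond the constants and arithmetic shown in the procedure above.

```cpp
#include <cassert>
#include <cstdint>

const int COLOR5_PAL0_PREV_HI = 9;
const int COLOR5_PAL1_PREV_HI = 21;

// Returns 0, 1, or 2 to select color5_delta_model0/1/2, based on the
// previous value of the same color component.
int select_color5_delta_model(int prev_component)
{
    if (prev_component <= COLOR5_PAL0_PREV_HI)
        return 0;
    if (prev_component <= COLOR5_PAL1_PREV_HI)
        return 1;
    return 2;
}

// Applies a decoded delta symbol to the previous 5-bit component.
// The result wraps modulo 32.
int apply_color5_delta(int prev_component, int delta_sym)
{
    return (prev_component + delta_sym) & 31;
}
```

Because of the modulo 32 wrap, a decoded symbol of 29 behaves like a delta of -3: starting from the initial value 16, it produces 13.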
The selector codebook section starts at file offset basis_file_header::m_selector_cb_file_ofs and is m_selector_cb_file_size bytes long. The selector codebook will have basis_file_header::m_total_selectors total entries.
The first bit of this section indicates if "global" selector codebooks are used. Basis Universal doesn't currently utilize global selector codebooks, so this bit should always be 0.
The second bit of this section indicates if "hybrid" global/local selector codebooks are used. Hybrid codebooks are not supported either, so this bit should always be 0.
The third bit indicates if the selector codebook has been sent in raw form (uncompressed). If it's set, each selector is sent as four 8-bit bytes. Each byte corresponds to four 2-bit ETC1S selectors. The first selector of each group of 4 selectors starts at the LSB (least significant bit) of each byte, and is 2-bits wide.
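As a small illustration of this packing, the sketch below unpacks one selector byte into its four 2-bit selectors, starting from the least significant bit. The function name is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Unpacks one byte into four 2-bit ETC1S selectors.
// Element 0 comes from bits 1..0, element 3 from bits 7..6.
std::array<uint8_t, 4> unpack_selector_byte(uint8_t b)
{
    std::array<uint8_t, 4> s{};
    for (uint32_t k = 0; k < 4; k++)
        s[k] = static_cast<uint8_t>((b >> (k * 2)) & 3);
    return s;
}
```

The byte 0xE4 (binary 11 10 01 00) unpacks to the selectors 0, 1, 2, 3.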
If the third bit is 0, the selectors have been DPCM coded with Huffman coding. The "delta_selector_pal_model" Huffman table will immediately follow the third bit, and is stored using the procedure outlined in section 6.0.
Immediately following the Huffman table is the compressed selector codebook. Here is the DPCM decoding procedure:
uint8_t prev_bytes[4] = { 0, 0, 0, 0 };
for (uint32_t i = 0; i < num_selectors; i++)
{
if (!i)
{
// First selector is sent raw
for (uint32_t j = 0; j < 4; j++)
{
uint32_t cur_byte = get_bits(8);
prev_bytes[j] = static_cast<uint8_t>(cur_byte);
for (uint32_t k = 0; k < 4; k++)
selectors[i].set_selector(k, j, (cur_byte >> (k * 2)) & 3);
}
selectors[i].init_flags();
continue;
}
// Subsequent selectors are sent with a simple form of byte-wise DPCM coding.
for (uint32_t j = 0; j < 4; j++)
{
int delta_byte = decode_huffman(delta_selector_pal_model);
uint32_t cur_byte = delta_byte ^ prev_bytes[j];
prev_bytes[j] = static_cast<uint8_t>(cur_byte);
for (uint32_t k = 0; k < 4; k++)
selectors[i].set_selector(k, j, (cur_byte >> (k * 2)) & 3);
}
}
Any bytes in this section following the selector codebook bits can be safely ignored.
Each ETC1S slice is compressed with four Huffman tables stored using the procedure outlined in section 6.0. These Huffman tables are stored at file offset basis_file_header::m_tables_file_ofs. This section will be basis_file_header::m_tables_file_size bytes long.
The following four Huffman tables are sent, in this order:
1. endpoint_pred_model
2. delta_endpoint_model
3. selector_model
4. selector_history_buf_rle_model
Following the last Huffman table are 13 bits indicating the size of the selector history buffer. Any remaining bits may be safely ignored.
ETC1S slices consist of a compressed 2D array of ETC1S blocks, always compressed in top-down/left-right raster order. For texture video, the previous slice's already decoded contents may be referred to when blocks are encoded using Conditional Replenishment (also known as "skip blocks").
Each ETC1S block is encoded by using references to the color endpoint codebook and the selector codebook. Sections 10.1 and 10.2 describe the helper procedures used by the decoder, and section 10.3 describes how the array of ETC1S blocks is actually decoded.
An approximate Move to Front (MTF) approach is used to efficiently encode the selector codebook references. Here is the C++ example class for approximate MTF decoding:
class approx_move_to_front
{
public:
approx_move_to_front(uint32_t n)
{
init(n);
}
void init(uint32_t n)
{
m_values.resize(n);
m_rover = n / 2;
}
size_t size() const { return m_values.size(); }
const int& operator[] (uint32_t index) const { return m_values[index]; }
int operator[] (uint32_t index) { return m_values[index]; }
void add(int new_value)
{
m_values[m_rover++] = new_value;
if (m_rover == m_values.size())
m_rover = (uint32_t)m_values.size() / 2;
}
void use(uint32_t index)
{
if (index)
{
int x = m_values[index / 2];
int y = m_values[index];
m_values[index / 2] = y;
m_values[index] = x;
}
}
private:
std::vector<int> m_values;
uint32_t m_rover;
};
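To see how this class behaves, here is a small self-contained sketch. It repeats the class (with only cosmetic changes) so it compiles on its own, and adds a hypothetical snapshot() helper that copies out the current contents. New values are written into the back half of the buffer, and each use() swaps an entry with the one at half its index, so frequently used entries drift towards index 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

class approx_move_to_front
{
public:
    explicit approx_move_to_front(uint32_t n) { init(n); }

    void init(uint32_t n)
    {
        m_values.resize(n);
        m_rover = n / 2;
    }

    size_t size() const { return m_values.size(); }
    const int& operator[] (uint32_t index) const { return m_values[index]; }

    // Writes into the back half of the buffer, wrapping at the end.
    void add(int new_value)
    {
        m_values[m_rover++] = new_value;
        if (m_rover == m_values.size())
            m_rover = static_cast<uint32_t>(m_values.size()) / 2;
    }

    // Swaps the used entry with the entry at half its index.
    void use(uint32_t index)
    {
        if (index)
        {
            const int x = m_values[index / 2];
            const int y = m_values[index];
            m_values[index / 2] = y;
            m_values[index] = x;
        }
    }

private:
    std::vector<int> m_values;
    uint32_t m_rover;
};

// Hypothetical helper: copies out the buffer contents for inspection.
std::vector<int> snapshot(const approx_move_to_front& mtf)
{
    std::vector<int> out;
    for (uint32_t i = 0; i < mtf.size(); i++)
        out.push_back(mtf[i]);
    return out;
}
```

With a 4-entry buffer, adding 10, 20, 30 leaves the contents as (0, 0, 30, 20). Calling use(3) and then use(1) moves the value 20 to index 1 and then to index 0.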
ETC1S slice decoding utilizes a simple Variable Length Coding (VLC) scheme that sends raw bits using variable-size chunks. Here is the VLC decoding procedure:
uint32_t decode_vlc(uint32_t chunk_bits)
{
assert(chunk_bits);
const uint32_t chunk_size = 1 << chunk_bits;
const uint32_t chunk_mask = chunk_size - 1;
uint32_t v = 0;
uint32_t ofs = 0;
for ( ; ; )
{
uint32_t s = get_bits(chunk_bits + 1);
v |= ((s & chunk_mask) << ofs);
ofs += chunk_bits;
if ((s & chunk_size) == 0)
break;
if (ofs >= 32)
{
assert(0);
break;
}
}
return v;
}
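The VLC format may be easier to follow next to a matching encoder. The sketch below pairs the decoding procedure with an encoder inferred from it, and runs both over a tiny in-memory bit queue. The bit_queue type and encode_vlc are hypothetical and are not taken from the reference encoder. Each chunk carries chunk_bits payload bits plus one continuation flag in the bit above them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal bit queue for the sketch: one vector entry per bit, LSB first.
struct bit_queue
{
    std::vector<uint8_t> bits;
    size_t read_pos = 0;

    void put_bits(uint32_t v, uint32_t n)
    {
        for (uint32_t i = 0; i < n; i++)
            bits.push_back(static_cast<uint8_t>((v >> i) & 1U));
    }

    uint32_t get_bits(uint32_t n)
    {
        uint32_t v = 0;
        for (uint32_t i = 0; i < n; i++)
        {
            assert(read_pos < bits.size());
            v |= static_cast<uint32_t>(bits[read_pos++]) << i;
        }
        return v;
    }
};

// Inferred encoder: emits the low chunk first and sets the continuation
// flag whenever more non-zero bits remain.
void encode_vlc(bit_queue& q, uint32_t v, uint32_t chunk_bits)
{
    assert(chunk_bits >= 1 && chunk_bits < 32);
    const uint32_t chunk_size = 1U << chunk_bits;
    const uint32_t chunk_mask = chunk_size - 1;
    do
    {
        uint32_t s = v & chunk_mask;
        v >>= chunk_bits;
        if (v)
            s |= chunk_size;
        q.put_bits(s, chunk_bits + 1);
    } while (v);
}

// Decoder, following the procedure in this section.
uint32_t decode_vlc(bit_queue& q, uint32_t chunk_bits)
{
    assert(chunk_bits >= 1 && chunk_bits < 32);
    const uint32_t chunk_size = 1U << chunk_bits;
    const uint32_t chunk_mask = chunk_size - 1;
    uint32_t v = 0, ofs = 0;
    for ( ; ; )
    {
        const uint32_t s = q.get_bits(chunk_bits + 1);
        v |= (s & chunk_mask) << ofs;
        ofs += chunk_bits;
        if ((s & chunk_size) == 0)
            break;
        if (ofs >= 32)
            break; // malformed stream
    }
    return v;
}
```

With chunk_bits set to 4, the value 0x123 is sent as three 5-bit chunks (15 bits total), while any value below 16 takes a single 5-bit chunk.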
Each slice has a corresponding basis_slice_desc structure, described in section 4.2. The slice's dimensions in ETC1S blocks are stored in basis_slice_desc::m_num_blocks_x and basis_slice_desc::m_num_blocks_y. Each slice is located at file offset basis_slice_desc::m_file_ofs, and is basis_slice_desc::m_file_size bytes long.
The decoder iterates through all the slice blocks in top-down, left-right raster order. Each block is represented by an index into the color endpoint codebook and another index into the selector codebook. The endpoint codebook contains each ETC1S block's base RGB color and intensity table information, and the selector codebook contains each block's 4x4 array of texel selectors (2 bits each). This is all the information needed to fully represent the texels within each block.
The decoding procedure loops over all the blocks in raster order, and decodes the endpoint and selector indices used to represent each block. The decoding procedure is complex enough that commented code is best used to describe it.
Here's the slice decoding procedure. This block of code shows the block loop, and how endpoint codebook indices are decoded. The next block of code shows how selector codebook indices are decoded.
// Constants used by the decoder
const uint32_t ENDPOINT_PRED_TOTAL_SYMBOLS = (4 * 4 * 4 * 4) + 1;
const uint32_t ENDPOINT_PRED_REPEAT_LAST_SYMBOL = ENDPOINT_PRED_TOTAL_SYMBOLS - 1;
const uint32_t ENDPOINT_PRED_MIN_REPEAT_COUNT = 3;
const uint32_t ENDPOINT_PRED_COUNT_VLC_BITS = 4;
const uint32_t NUM_ENDPOINT_PREDS = 3;
const uint32_t CR_ENDPOINT_PRED_INDEX = NUM_ENDPOINT_PREDS - 1;
const uint32_t NO_ENDPOINT_PRED_INDEX = 3;
// Endpoint/selector codebooks - decoded previously. See sections 7.0 and 8.0.
endpoint endpoints[endpoint_codebook_size];
selector selectors[selector_codebook_size];
// Array of per-block values used for endpoint index prediction (enough for 2 rows).
struct block_preds
{
uint16_t m_endpoint_index;
uint8_t m_pred_bits;
};
block_preds block_endpoint_preds[2][num_blocks_x];
// Some constants and state used during block decoding
const uint32_t SELECTOR_HISTORY_BUF_FIRST_SYMBOL_INDEX = selector_codebook_size;
const uint32_t SELECTOR_HISTORY_BUF_RLE_SYMBOL_INDEX = selector_history_buf_size + SELECTOR_HISTORY_BUF_FIRST_SYMBOL_INDEX;
uint32_t cur_selector_rle_count = 0;
uint32_t cur_pred_bits = 0;
int prev_endpoint_pred_sym = 0;
int endpoint_pred_repeat_count = 0;
uint32_t prev_endpoint_index = 0;
// This array is only used for texture video. It holds the previous frame's endpoint and selector indices (each 16-bits, for 32-bits total).
uint32_t prev_frame_indices[num_blocks_x][num_blocks_y];
// Selector history buffer - See section 10.1.
// For the selector history buffer's size, see section 9.0.
approx_move_to_front selector_history_buf(selector_history_buf_size);
// Loop over all slice blocks in raster order
for (uint32_t block_y = 0; block_y < num_blocks_y; block_y++)
{
// The index into the block_endpoint_preds array
const uint32_t cur_block_endpoint_pred_array = block_y & 1;
for (uint32_t block_x = 0; block_x < num_blocks_x; block_x++)
{
// Check if we're at the start of a 2x2 block group.
if ((block_x & 1) == 0)
{
// Are we on an even or odd row of blocks?
if ((block_y & 1) == 0)
{
// We're on an even row and column of blocks. Decode the combined endpoint index predictor symbols for 2x2 blocks.
// This symbol tells the decoder how the endpoints are decoded for each block in a 2x2 group of blocks.
// Are we in an RLE run?
if (endpoint_pred_repeat_count)
{
// Inside a run of endpoint predictor symbols.
endpoint_pred_repeat_count--;
cur_pred_bits = prev_endpoint_pred_sym;
}
else
{
// Decode the endpoint prediction symbol, using the "endpoint pred" Huffman table (see section 9.0).
cur_pred_bits = decode_huffman(endpoint_pred_model);
if (cur_pred_bits == ENDPOINT_PRED_REPEAT_LAST_SYMBOL)
{
// It's a run of symbols, so decode the count using VLC decoding (see section 10.2)
endpoint_pred_repeat_count = decode_vlc(ENDPOINT_PRED_COUNT_VLC_BITS) + ENDPOINT_PRED_MIN_REPEAT_COUNT - 1;
cur_pred_bits = prev_endpoint_pred_sym;
}
else
{
// It's not a run of symbols
prev_endpoint_pred_sym = cur_pred_bits;
}
}
// The symbol has enough endpoint prediction information for 4 blocks (2 bits per block), so 8 bits total.
// Remember the prediction information we should use for the next row of 2 blocks beneath the current block.
block_endpoint_preds[cur_block_endpoint_pred_array ^ 1][block_x].m_pred_bits = (uint8_t)(cur_pred_bits >> 4);
}
else
{
// We're on an odd row of blocks, so use the endpoint prediction information we previously stored on the previous even row.
cur_pred_bits = block_endpoint_preds[cur_block_endpoint_pred_array][block_x].m_pred_bits;
}
}
// Decode the current block's endpoint and selector indices.
uint32_t endpoint_index, selector_index = 0;
// Get the 2-bit endpoint prediction index for this block.
const uint32_t pred = cur_pred_bits & 3;
// Get the next block's endpoint prediction bits ready.
cur_pred_bits >>= 2;
// Now check to see if we should reuse a previously encoded block's endpoints.
if (pred == 0)
{
// Reuse the left block's endpoint index
assert(block_x > 0);
endpoint_index = prev_endpoint_index;
}
else if (pred == 1)
{
// Reuse the upper block's endpoint index
assert(block_y > 0);
endpoint_index = block_endpoint_preds[cur_block_endpoint_pred_array ^ 1][block_x].m_endpoint_index;
}
else if (pred == 2)
{
if (is_video)
{
// If it's texture video, reuse the previous frame's endpoint index, at this block.
assert(pred == CR_ENDPOINT_PRED_INDEX);
endpoint_index = prev_frame_indices[block_x][block_y];
selector_index = endpoint_index >> 16;
endpoint_index &= 0xFFFFU;
}
else
{
// Reuse the upper left block's endpoint index.
assert((block_x > 0) && (block_y > 0));
endpoint_index = block_endpoint_preds[cur_block_endpoint_pred_array ^ 1][block_x - 1].m_endpoint_index;
}
}
else
{
// We need to decode and apply a DPCM encoded delta to the previously used endpoint index.
// This uses the delta endpoint Huffman table (see section 9.0).
const uint32_t delta_sym = decode_huffman(delta_endpoint_model);
endpoint_index = delta_sym + prev_endpoint_index;
// Wrap around if the index goes beyond the end of the endpoint codebook
if (endpoint_index >= endpoints.size())
endpoint_index -= (int)endpoints.size();
}
// Remember the endpoint index we used on this block, so the next row can potentially reuse the index.
block_endpoint_preds[cur_block_endpoint_pred_array][block_x].m_endpoint_index = (uint16_t)endpoint_index;
// Remember the endpoint index used
prev_endpoint_index = endpoint_index;
// Now we have fully decoded the ETC1S endpoint codebook index, in endpoint_index.
// Now decode the selector index (see the next block of code, below).
< selector decoding - see below >
} // block_x
} // block_y
The compressed format allows the encoder to reuse the endpoint index used by the previous block, the block immediately above the current block, or the block to the upper left (if the file is not texture video). Alternately, the encoder can send a Huffman coded DPCM encoded index relative to the previously used endpoint index.
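The DPCM case can be illustrated with a pair of small functions. The decoder side follows the wrap-around logic in the procedure above. The encoder side is only inferred from it and is not taken from the reference encoder. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Decoder: adds the decoded delta symbol to the previous endpoint index,
// wrapping if the result runs past the end of the endpoint codebook.
uint32_t decode_endpoint_index(uint32_t prev_index, uint32_t delta_sym, uint32_t codebook_size)
{
    assert(prev_index < codebook_size && delta_sym < codebook_size);
    uint32_t index = prev_index + delta_sym;
    if (index >= codebook_size)
        index -= codebook_size;
    return index;
}

// Inferred encoder: the non-negative delta that the decoder will undo.
uint32_t encode_endpoint_delta(uint32_t prev_index, uint32_t index, uint32_t codebook_size)
{
    assert(prev_index < codebook_size && index < codebook_size);
    return (index >= prev_index) ? (index - prev_index)
                                 : (index + codebook_size - prev_index);
}
```

With a 100-entry codebook, moving from index 90 to index 5 is sent as a delta of 15, since 90 + 15 wraps around to 5.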
Which type of prediction was used by the encoder is controlled by the "endpoint pred" (endpoint prediction) indices, which are sent with Huffman coding (using the "endpoint_pred_model" table described in Section 9.0) once every 2x2 blocks.
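Based on how the decoding procedure above consumes cur_pred_bits, one 8-bit endpoint prediction symbol appears to break down into four 2-bit indices as sketched here: the low four bits cover the top row of the 2x2 group and the high four bits cover the bottom row. The struct and function names are hypothetical. Symbol 256 (ENDPOINT_PRED_REPEAT_LAST_SYMBOL) is the run marker and is not a packed group, so it is excluded.

```cpp
#include <cassert>
#include <cstdint>

// The four 2-bit endpoint prediction indices of one 2x2 block group.
struct pred_group
{
    uint8_t top_left;
    uint8_t top_right;
    uint8_t bottom_left;
    uint8_t bottom_right;
};

// Splits an endpoint prediction symbol in [0, 255] into its four indices.
pred_group unpack_endpoint_preds(uint32_t sym)
{
    assert(sym < 256);

    pred_group g;
    g.top_left     = static_cast<uint8_t>(sym & 3);
    g.top_right    = static_cast<uint8_t>((sym >> 2) & 3);
    g.bottom_left  = static_cast<uint8_t>((sym >> 4) & 3);
    g.bottom_right = static_cast<uint8_t>((sym >> 6) & 3);
    return g;
}
```

For example, the symbol 0xE4 gives the indices 0 (reuse left), 1 (reuse above), 2 (upper left, or previous frame for video), and 3 (DPCM delta) for the four blocks in that order.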
For texture video, the endpoint prediction symbol normally used to refer to the upper left block (endpoint pred index 2) instead indicates that both the endpoint and selector indices from the previous frame's block should be reused on the current frame's block. The endpoint pred indices are RLE coded, so this allows the encoder to efficiently skip over a large number of unchanged blocks in a video sequence.
The code to decode the selector codebook index immediately follows the code above for decoding the endpoint indices:
const uint32_t MAX_SELECTOR_HISTORY_BUF_SIZE = 64;
const uint32_t SELECTOR_HISTORY_BUF_RLE_COUNT_THRESH = 3;
const uint32_t SELECTOR_HISTORY_BUF_RLE_COUNT_BITS = 6;
const uint32_t SELECTOR_HISTORY_BUF_RLE_COUNT_TOTAL = (1 << SELECTOR_HISTORY_BUF_RLE_COUNT_BITS);
// Decode selector index, unless it's texture video and the endpoint predictor indicated that the
// block's endpoints were reused from the previous frame.
if ((!is_video) || (pred != CR_ENDPOINT_PRED_INDEX))
{
int selector_sym;
// Are we in a selector RLE run?
if (cur_selector_rle_count > 0)
{
// Handle selector RLE run.
cur_selector_rle_count--;
selector_sym = (int)selectors.size();
}
else
{
// Decode the selector symbol, using the selector Huffman table (see section 9.0).
selector_sym = decode_huffman(selector_model);
// Is it a run?
if (selector_sym == static_cast<int>(SELECTOR_HISTORY_BUF_RLE_SYMBOL_INDEX))
{
// Decode the selector run's size, using the selector history buf RLE Huffman table (see section 9.0).
int run_sym = decode_huffman(selector_history_buf_rle_model);
// Is it a very long run?
if (run_sym == (SELECTOR_HISTORY_BUF_RLE_COUNT_TOTAL - 1))
cur_selector_rle_count = decode_vlc(7) + SELECTOR_HISTORY_BUF_RLE_COUNT_THRESH;
else
cur_selector_rle_count = run_sym + SELECTOR_HISTORY_BUF_RLE_COUNT_THRESH;
selector_sym = (int)selectors.size();
cur_selector_rle_count--;
}
}
// Is it a reference into the selector history buffer?
if (selector_sym >= (int)selectors.size())
{
assert(selector_history_buf_size > 0);
// Compute the history buffer index
int history_buf_index = selector_sym - (int)selectors.size();
assert(history_buf_index < selector_history_buf.size());
// Access the history buffer
selector_index = selector_history_buf[history_buf_index];
// Update the history buffer
if (history_buf_index != 0)
selector_history_buf.use(history_buf_index);
}
else
{
// It's an index into the selector codebook
selector_index = selector_sym;
// Add it to the selector history buffer
if (selector_history_buf_size)
selector_history_buf.add(selector_index);
}
}
// For texture video, remember the endpoint and selector indices used by the block on this frame, for later reuse on the next frame.
if (is_video)
prev_frame_indices[block_x][block_y] = endpoint_index | (selector_index << 16);
// The block is fully decoded here. The codebook indices are endpoint_index and selector_index.
// Make sure they are valid
assert((endpoint_index < endpoints.size()) && (selector_index < selectors.size()));
At this point, the decoder has decoded each block's endpoint and selector codebook indices. It can now fetch the actual ETC1S endpoints/selectors from the codebooks and write out ETC1S texture data, or it can immediately transcode the ETC1S data to another GPU texture format.
ETC1S .basis files can have optional alpha channels, stored in odd slices. If any slice needs an alpha channel, all slices must have alpha channels. basis_file_header::m_flags will be logically OR'd with cBASISHeaderFlagHasAlphaSlices. Alpha channel ETC1S files will contain two slices for each mipmap level (or face, or video frame, etc.). The basis_slice_desc::m_flags field will be logically OR'd with cSliceDescFlagsHasAlpha for all odd alpha slices.
The even slices will contain the RGB data, and the odd slices will contain the alpha data, both stored in ETC1S format. Alpha channel ETC1S files must always have an even total number of slices. A decoder can first decode the RGB data slice, then the next alpha channel slice, or it can decode them in parallel using multithreading. The ETC1S green channel (on the odd slices) contains the alpha values.
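The even/odd layout described above lends itself to a simple consistency check. The sketch below is one way a decoder might verify it, given the m_flags byte of each slice description in file order. The function names and the slice_pair type are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

const uint8_t cSliceDescFlagsHasAlpha = 1;

// Slice indices of the RGB slice and its alpha slice for the n-th pair.
struct slice_pair
{
    uint32_t rgb_slice;
    uint32_t alpha_slice;
};

slice_pair slices_for_pair(uint32_t pair_index)
{
    return slice_pair{ 2 * pair_index, 2 * pair_index + 1 };
}

// Checks that an alpha file has an even slice count, with the alpha flag
// set on every odd slice and clear on every even slice.
bool alpha_layout_is_valid(const std::vector<uint8_t>& slice_flags)
{
    if (slice_flags.size() & 1)
        return false;

    for (size_t i = 0; i < slice_flags.size(); i++)
    {
        const bool has_alpha = (slice_flags[i] & cSliceDescFlagsHasAlpha) != 0;
        const bool is_odd = (i & 1) != 0;
        if (has_alpha != is_odd)
            return false;
    }
    return true;
}
```

Other flag bits, such as cSliceDescFlagsFrameIsIFrame, are ignored by this check.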
Both ETC1S and UASTC format files support texture video. Texture video files are basically long 2D texture arrays with two different frame types ("I" and "P" frames), and conditional replenishment (sometimes called "skip blocks"). The basis_file_header::m_tex_type field in the file header will be set to cBASISTexTypeVideoFrames. The basis_file_header::m_us_per_frame field, if non-zero, indicates the suggested framerate. Texture video files can be optionally mipmapped, and can contain optional alpha channels (stored as separate slices in ETC1S format files).
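Since m_us_per_frame is stored in microseconds per frame, a player can convert it to frames per second as in this small sketch. The function name is hypothetical, and returning 0 for an unspecified rate is a choice made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Converts microseconds per frame to frames per second.
// Returns 0.0 when the header leaves the rate unspecified (zero).
double suggested_fps(uint32_t us_per_frame)
{
    return us_per_frame ? (1000000.0 / us_per_frame) : 0.0;
}
```

A value of 40000 corresponds to 25 frames per second.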
Currently, the first frame is always an I-frame, and all subsequent frames are P-frames, but the file format and transcoder support any frame being an I-frame (and the encoder will be enhanced to support this feature). Decoders must track the previously decoded frame's endpoints/selectors for all mipmap levels (if any), not just the top level's.
Skip blocks always refer to the previous frame. I-frames cannot use skip blocks (encoded as endpoint predictor index 2).
This section will include several example .basis file bitstreams, along with their decoded equivalents, which should be helpful for new decoder verification.