I'm trying to build a variable-bitwidth binary encoder/decoder for a protocol in a networked personal project I'm designing... and my stupid brain has completely jammed up on how to grasp variable-bitwidth binary encoding/decoding. I've been staring at the same ~18 SLOC for about four days now and I still feel utterly out of my depth. I do feel I should be able to grasp this, but unfortunately my brain randomly conks out when I try to track diverse numeric relationships :(
Theoretically, what I want to do is incredibly simple. I declare a list of bitwidths, and the value I want to store within that bitwidth, like so:
[10 bits]: 1013
Taking the bitwidths into account, the values would be progressively sequenced into a binary string like so:
1 bit
| 2 bits
| ||
/-10 bits--\ | \/ /------ 19 bits ------\
12345678 12 3 45 678 12345678 12345678
11111101 01 1 11 000 11001000 00001001
\----------/ | \/ \---------------------/
1012 1 3 51209
The bitstream in bold above is what I transmit down the wire:
11111101 253 FD
01111000 120 78
11001000 200 C8
00001001 9 09
At the other end, I map the list {10, 1, 2, 19}
(the set of bitwidths denoted at the top) against the sequence of bytes I've received, which then leaves me with the original set of values I wanted to transmit.
While this is primarily a general algorithmic question and the semantics of any particular language are arguably irrelevant, I should mention that I'm using PHP for this - and part of the reason I'm so stuck is that I'm trying to work out how to avoid resorting to string manipulation, which PHP seems to be obsessively inclined to encourage one to favor over proper numeric processing. However, this function will (hopefully) be processing data at 2-to-3-digit-Mbps speeds, and I want the routines to be as fast as possible. (Incidentally, I'd use another language, but PHP is the one I know the most >_>)
I do understand the basics of binary, and I should stress that I do actually have a working encoder, I think (it outputs the binary data backwards) - but I have no real idea what I'm doing here. I'm not asking "how do I write this function," I'm asking "what are the structural mechanics of what I'm doing and how do I even grasp this."
I should also note that this problem isn't a thinly disguised homework exercise; it's for the network library component in a personal software project I'm designing that, hopefully, will teach me about networking, event handling, and concurrency. Hopefully I can grasp all the other things I'll need to implement... lol