[LRUG] How would you turn a bit stream into a character stream?

Chris Parsons chris.p at rsons.org
Mon Jun 11 06:15:28 PDT 2012


Hmm, interesting. Feels like a bug causing 13-15 to appear three times as often. Can we see the code?

I would round the size of the character set up to the next power of 2 (for 26 characters, that's 32, i.e. 5 bits). Then take 5 bits at a time, convert them to an index into the set, and throw away any index greater than or equal to your set size (26-31 here). That discards some bits, but the input is random so it doesn't matter(?), and it should ensure that the remaining characters are equally distributed.
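
Here's a minimal sketch of that idea in Ruby, in case it helps. The method name chars_from_bits and the choice to represent the stream as an enumerable of 0/1 integers are my own assumptions for illustration, not anything from your code:

```ruby
# Hypothetical helper: maps an enumerable of bits (0/1 integers) to
# characters drawn uniformly from charset, discarding out-of-range indexes.
def chars_from_bits(bits, charset)
  raise ArgumentError, "charset needs at least 2 entries" if charset.size < 2
  return enum_for(:chars_from_bits, bits, charset) unless block_given?

  # Smallest bit width that can address every index: 5 for 26 characters.
  width = (charset.size - 1).bit_length

  bits.each_slice(width) do |chunk|
    next if chunk.size < width          # incomplete trailing chunk
    index = chunk.reduce(0) { |acc, bit| (acc << 1) | bit }
    next if index >= charset.size       # reject 26..31 for a 26-letter set
    yield charset[index]
  end
end

letters = ("A".."Z").to_a
bits = "1100111001110110".chars.map(&:to_i) # first 16 bits of your example
puts chars_from_bits(bits, letters).to_a.join # prints ZZ
```

For 26 letters this rejects 6 of every 32 chunks, so on average it uses about 6.15 bits per letter (5 / (26/32)). Because it returns an enumerator when called without a block, it should also work on an endless stream, e.g. chars_from_bits(stream, letters).first(10).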

Chris

On 11 Jun 2012, at 13:56, James Coglan wrote:

> Hi all,
> 
> I have a problem I'm not sure how to solve, more of a general programming problem than a Ruby thing. Basically, say you have an indefinite stream of random bits:
> 
> 1100111001110110101100000110101011011110110011111001101000001000111000100000011001111110100010000001000001111110100010010100111111011101110110011111001101010101101111111011101111111000010001001001100100100101011100001001000001110101011101101101010011100111 ...
> 
> and you want to turn it into a stream of characters from an arbitrary-size list of characters, let's say the 26 letters from A to Z. You want the letters to be evenly distributed in the output; each letter should have an equal probability of appearing.
> 
> This is trivial if the character set's size is a power of two, since you can chop off log2(N) bits at a time and turn them into a valid index. But I'm not sure how to do it correctly for arbitrary set sizes. Suggestions?
> 
> -- 
> James Coglan
> http://jcoglan.com
> +44 (0) 7771512510
> _______________________________________________
> Chat mailing list
> Chat at lists.lrug.org
> http://lists.lrug.org/listinfo.cgi/chat-lrug.org

--
Chris Parsons
chris.p at rsons.org
http://twitter.com/chrismdp
http://pa.rsons.org



