Encoding algorithm: Variable Radix Huffman Encoding

Description

Huffman encoding is a method of developing an optimal encoding of the symbols in a source alphabet using symbols from a target alphabet when the frequencies of each of the symbols in the source alphabet are known. Optimal means the average length of an encoded message will be minimized. In this problem you are to determine an encoding of the first N uppercase letters (the source alphabet, S1 through SN, with frequencies f1 through fN) into the first R decimal digits (the target alphabet, T1 through TR).
Consider determining the encoding when R=2. Encoding proceeds in several passes. In each pass the two source symbols with the lowest frequencies, say S1 and S2, are grouped to form a new "combination letter" whose frequency is the sum of f1 and f2. If there is a tie for the lowest or second lowest frequency, the letter occurring earlier in the alphabet is selected. The passes continue until only two letters remain to be combined. The letters combined in each pass are each assigned one of the symbols from the target alphabet. The letter with the lower frequency is assigned the code 0, and the other letter is assigned the code 1. (If both letters in a combined group have the same frequency, then 0 is assigned to the one earliest in the alphabet. For the purpose of comparisons, the value of a "combination letter" is the value of the earliest letter in the combination.) The final code sequence for a source symbol is formed by concatenating the target alphabet symbols assigned each time a combination letter containing that source symbol is formed. The target symbols are concatenated in the reverse of the order in which they are assigned, so that the first symbol in the final code sequence is the last target symbol assigned to a combination letter. The two illustrations below demonstrate the process for R=2.
Symbol  Frequency                         |  Symbol  Frequency
A            5                            |  A            7
B            7                            |  B            7
C            8                            |  C            7
D           15                            |  D            7

Pass 1: A and B grouped                   |  Pass 1: A and B grouped
Pass 2: {A,B} and C grouped               |  Pass 2: C and D grouped
Pass 3: {A,B,C} and D grouped             |  Pass 3: {A,B} and {C,D} grouped

Resulting codes: A=110, B=111, C=10, D=0  |  Resulting codes: A=00, B=01, C=10, D=11
Avg. length = (3*5+3*7+2*8+1*15)/35=1.91  |  Avg. length = (2*7+2*7+2*7+2*7)/28=2.00
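
The R=2 passes described above can be traced with a short sketch. This is only an illustration under stated assumptions, not a reference solution: the names `trace_binary` and `label` are hypothetical, and each node is represented as a tuple of (frequency, index of earliest letter, member letter indices) so that sorting applies the tie-breaking rule directly.

```python
def label(node):
    """Render a node as 'A' or '{A,B}' (hypothetical helper for display)."""
    letters = [chr(ord("A") + i) for i in node[2]]
    return letters[0] if len(letters) == 1 else "{" + ",".join(letters) + "}"


def trace_binary(freqs):
    """Run the R=2 passes; return (groups per pass, code per letter)."""
    # Node layout: (frequency, index of earliest letter, member indices).
    nodes = [(f, i, [i]) for i, f in enumerate(freqs)]
    codes = [""] * len(freqs)
    passes = []
    while len(nodes) > 1:
        # Lowest frequency first; ties go to the earlier letter.
        nodes.sort(key=lambda node: (node[0], node[1]))
        low, high = nodes[0], nodes[1]
        # Record the pair in alphabetical order, as in the illustrations.
        first, second = sorted((low, high), key=lambda node: node[1])
        passes.append((label(first), label(second)))
        # 0 goes to the lower node, 1 to the other; prepend so that the
        # digit assigned last ends up first in the final code.
        for digit, node in enumerate((low, high)):
            for i in node[2]:
                codes[i] = str(digit) + codes[i]
        merged = (low[0] + high[0], min(low[1], high[1]),
                  sorted(low[2] + high[2]))
        nodes = nodes[2:] + [merged]
    return passes, codes


passes, codes = trace_binary([5, 7, 8, 15])
print(passes)  # [('A', 'B'), ('{A,B}', 'C'), ('{A,B,C}', 'D')]
print(codes)   # ['110', '111', '10', '0']
```

Re-sorting the whole list on every pass is quadratic, which is harmless for at most 26 letters and keeps the tie-breaking rule visible in one place.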

When R is larger than 2, R symbols are grouped in each pass. Since each pass effectively replaces R letters or combination letters by 1 combination letter, and the last pass must combine R letters or combination letters, the source alphabet must contain k*(R-1)+R letters, for some integer k. Since N may not be this large, an appropriate number of fictitious letters with zero frequencies must be included. These fictitious letters are not to be included in the output. In making comparisons, the fictitious letters are later than any of the letters in the alphabet.
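
The number of fictitious letters follows from the k*(R-1)+R condition: the padded alphabet size must leave remainder 1 when divided by R-1 (and be at least R). A minimal sketch, assuming N is at least 2 as the input limits state; the function name `fictitious_count` is hypothetical.

```python
def fictitious_count(radix, n):
    """Zero-frequency letters needed so that n + count == k*(radix-1) + radix
    for some integer k >= 0. Assumes n >= 2 and radix >= 2."""
    # Python's % always returns a non-negative result for a positive modulus.
    return (1 - n) % (radix - 1)


print(fictitious_count(3, 4))   # 1 (4 letters, R=3: pad to 5 = 1*2 + 3)
print(fictitious_count(4, 6))   # 1 (6 letters, R=4: pad to 7 = 1*3 + 4)
print(fictitious_count(2, 5))   # 0 (R=2 never needs padding)
```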
Now the basic process of determining the Huffman encoding is the same as for the R=2 case. In each pass, the R letters with the lowest frequencies are grouped, forming a new combination letter with a frequency equal to the sum of the letters included in the group. The letters that were grouped are assigned the target alphabet symbols 0 through R-1. 0 is assigned to the letter in the combination with the lowest frequency, 1 to the next lowest frequency, and so forth. If several of the letters in the group have the same frequency, the one earliest in the alphabet is assigned the smaller target symbol, and so forth. The illustration below demonstrates the process for R=3.
Symbol  Frequency
A            5
B            7
C            8
D           15

Pass 1: ? (fictitious symbol), A and B are grouped
Pass 2: {?,A,B}, C and D are grouped

Resulting codes: A=11, B=12, C=0, D=2
Avg. length = (2*5+2*7+1*8+1*15)/35=1.34
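
The general procedure for any R can be sketched with a priority queue. This is one possible approach, not necessarily the intended one: the name `huffman_codes` is hypothetical, fictitious letters are given indices past the last real letter so they compare as "later", and heap entries are ordered by (frequency, earliest letter index).

```python
import heapq


def huffman_codes(radix, freqs):
    """Return the code string for each letter, in alphabetical order."""
    n = len(freqs)
    pad = (1 - n) % (radix - 1)  # number of fictitious letters
    # Entry layout: (frequency, earliest letter index, member indices).
    heap = [(f, i, [i]) for i, f in enumerate(freqs)]
    # Fictitious letters: frequency 0, later than any real letter,
    # and no members, so they never appear in the output.
    heap += [(0, n + j, []) for j in range(pad)]
    heapq.heapify(heap)
    codes = [""] * n
    while len(heap) > 1:
        # Pop the R lowest entries; pop order is the digit order 0..R-1.
        group = [heapq.heappop(heap) for _ in range(radix)]
        members = []
        for digit, (_, _, idxs) in enumerate(group):
            for i in idxs:
                codes[i] = str(digit) + codes[i]  # last assigned comes first
            members.extend(idxs)
        total = sum(entry[0] for entry in group)
        earliest = min(entry[1] for entry in group)
        heapq.heappush(heap, (total, earliest, members))
    return codes


print(huffman_codes(3, [5, 7, 8, 15]))  # ['11', '12', '0', '2']
```

No two entries ever share an earliest-letter index, so tuple comparison never falls through to the member lists.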

Input

The input will contain one or more data sets, one per line. Each data set consists of an integer value for R (between 2 and 10), an integer value for N (between 2 and 26), and the integer frequencies f1 through fN, each of which is between 1 and 999. The end of data for the entire input is the number 0 for R; it is not considered to be a separate data set.
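
Reading the input might look like the sketch below. It splits on whitespace rather than relying on one data set per line, which is an assumption made for robustness, and the function name `parse_data_sets` is hypothetical.

```python
def parse_data_sets(text):
    """Return a list of (R, [f1..fN]) pairs, stopping at the terminating 0."""
    tokens = iter(text.split())
    data_sets = []
    for token in tokens:
        radix = int(token)
        if radix == 0:  # end-of-input marker, not a data set
            break
        n = int(next(tokens))
        freqs = [int(next(tokens)) for _ in range(n)]
        data_sets.append((radix, freqs))
    return data_sets


sample = "2 5 5 10 20 25 40\n4 6 10 23 18 25 9 12\n0\n"
print(parse_data_sets(sample))
# [(2, [5, 10, 20, 25, 40]), (4, [10, 23, 18, 25, 9, 12])]
```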

Output

For each data set, display its number (numbering is sequential starting with 1) and the average target symbol length (rounded to two decimal places) on one line. Then display the N letters of the source alphabet and the corresponding Huffman codes, one letter and code per line. The examples below illustrate the required output format.
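
The average length is the frequency-weighted sum of code lengths divided by the total frequency. A minimal formatting sketch, with the hypothetical name `format_set`, assuming the codes are already known; whether a judge rounds exact halves the same way as Python's `.2f` formatting is not something the problem statement settles.

```python
def format_set(set_number, freqs, codes):
    """Build the output lines for one data set (no trailing blank line)."""
    total = sum(freqs)
    weighted = sum(f * len(code) for f, code in zip(freqs, codes))
    lines = [f"Set {set_number}; average length {weighted / total:.2f}"]
    for i, code in enumerate(codes):
        lines.append(f"{chr(ord('A') + i)}: {code}")
    return lines


for line in format_set(4, [10, 23, 18, 25, 9, 12],
                       ["32", "1", "0", "2", "31", "33"]):
    print(line)
# Set 4; average length 1.32
# A: 32
# B: 1
# C: 0
# D: 2
# E: 31
# F: 33
```

In the sample output each set is followed by a blank line, so a caller would print one after these lines.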

Sample Input

2 5 5 10 20 25 40
2 5 4 2 2 1 1
3 7 20 5 8 5 12 6 9
4 6 10 23 18 25 9 12
0

Sample Output

Set 1; average length 2.10
A: 1100
B: 1101
C: 111
D: 10
E: 0

Set 2; average length 2.20
A: 11
B: 00
C: 01
D: 100
E: 101

Set 3; average length 1.69
A: 1
B: 00
C: 20
D: 01
E: 22
F: 02
G: 21

Set 4; average length 1.32
A: 32
B: 1
C: 0
D: 2
E: 31
F: 33
