编程介的小学生 2017-12-01 16:39

Unique Words

Problem Description
A common problem faced by electronic information providers is determining the number of unique words in a document. The case of a word does not affect its uniqueness. For example, The, tHE and The are all considered equivalent. Punctuation can appear in these documents and is handled as follows:
1) Periods '.' and exclamation marks '!' may appear at the end of a sentence and should not be considered a word, or part of a word.
2) Dashes '-' appear between hyphenated words. The hyphenated words should be considered separately.
3) Commas ',', colons ':', and semicolons ';' appear within a sentence and should not be considered a word, or part of a word.
4) Apostrophes ' appear within contractions and possessive forms. These symbols should be treated as if they never appeared (i.e., as if they were deleted from the word).
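
The four punctuation rules above can be sketched as a small tokenizer. This is only an illustrative sketch, not a reference solution: the function name `words_in_line` is hypothetical, and it assumes that replacing the marks from rules 1-3 with spaces (and deleting apostrophes outright) is an acceptable reading of the rules.

```python
def words_in_line(line: str) -> list[str]:
    """Return the uppercase words of one line under the four punctuation rules."""
    # Rule 4: apostrophes are treated as if they were deleted,
    # so "bank's" becomes "banks".
    line = line.replace("'", "")
    # Rules 1-3: these marks are never part of a word. Turning a dash
    # into a space splits a hyphenated word into its separate parts.
    for mark in ".!-,:;":
        line = line.replace(mark, " ")
    # Case does not affect uniqueness, so normalize to uppercase.
    return line.upper().split()


print(words_in_line("his two-part message! His message,"))
# ['HIS', 'TWO', 'PART', 'MESSAGE', 'HIS', 'MESSAGE']
```

Duplicates are kept at this stage on purpose; removing them is a separate step once all lines of a document have been read.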

Input
The input file contains a series of documents, each separated by an entire line of text containing only the word EOD. Each document will contain no more than 1,000 lines and at most 100 unique words. All input lines will not contain more than 80 characters. Numbers, control characters, and punctuation symbols not listed above will not appear in the text. An entire line containing only the string EOT identifies the end of the list of documents; note this last document is terminated by EOT and not EOD.
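
One possible way to separate the documents described above is a generator that collects lines until it meets a sentinel. The name `split_documents` is hypothetical, and the sketch assumes a sentinel line may carry surrounding whitespace that can safely be stripped before comparison.

```python
from typing import Iterable, Iterator, List


def split_documents(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield each document as a list of its text lines."""
    current: List[str] = []
    for raw in lines:
        # Assumption: ignore whitespace around a sentinel line.
        marker = raw.strip()
        if marker in ("EOD", "EOT"):
            # Both sentinels close the current document.
            yield current
            current = []
            if marker == "EOT":
                # EOT also ends the whole input.
                return
        else:
            current.append(raw.rstrip("\r\n"))


sample = ["Hello world", "EOD", "This is a", "final example", "EOT"]
print(list(split_documents(sample)))
# [['Hello world'], ['This is a', 'final example']]
```

In a real submission the same function could be fed `sys.stdin` directly, since a file object is also an iterable of lines.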

Output
The output should be an alphabetically sorted list of all unique words, with each unique word displayed in uppercase.
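
The output step can be sketched as follows. The header line `WORDS IN DOCUMENT #n` is taken from the sample output rather than from the description above, and `format_document` is a hypothetical helper name; the sketch assumes documents are numbered from 1 in input order.

```python
from typing import Iterable, List


def format_document(number: int, words: Iterable[str]) -> List[str]:
    """Return the output lines for one document: a header, then sorted unique words."""
    # A set removes duplicates; uppercasing first makes "The" and "tHE" equal.
    unique = sorted({word.upper() for word in words})
    return [f"WORDS IN DOCUMENT #{number}"] + unique


print("\n".join(format_document(2, ["Hello", "world", "HELLO"])))
# WORDS IN DOCUMENT #2
# HELLO
# WORLD
```

Because the words contain only the letters A to Z after normalization, Python's default string ordering coincides with alphabetical order here.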

Sample Input
The banker hammered home his two-part message! His message,
at times satirical, was that the bank's situation was a mess.
EOD
Hello world
EOD
This is a
final example
EOT

Sample Output
WORDS IN DOCUMENT #1
A
AT
BANKER
BANKS
HAMMERED
HIS
HOME
MESS
MESSAGE
PART
SATIRICAL
SITUATION
THAT
THE
TIMES
TWO
WAS
WORDS IN DOCUMENT #2
HELLO
WORLD
WORDS IN DOCUMENT #3
A
EXAMPLE
FINAL
IS
THIS

1 answer

  • threenewbee 2017-12-02 15:25
    This answer was selected as the best answer by the asker.
