Problem Description
sevenzero is very interested in bioinformatics and has done some research on it. One day, sevenzero found a phenomenon called Microgene. A Microgene is a special fragment in DNA, and different Microgenes may have the same hereditary effect. Microgenes work if and only if more than one Microgene with the same hereditary effect appears in the DNA (the Microgenes may overlap). To finish his paper, sevenzero wants to know how many different DNAs of length L contain the hereditary effect caused by Microgenes.

To simplify the problem, a DNA or a Microgene is considered a string consisting of the characters 'A', 'T', 'C' and 'G'. A Microgene is in the DNA if the Microgene string is a substring of the DNA string. All Microgenes given are different and have the same hereditary effect.

Input
There are several test cases in the input. Each case begins with a line containing two integers N (1 ≤ N ≤ 6) and L (1 ≤ L ≤ 1000000), denoting the number of Microgenes and the length of the DNA. The following N lines contain N strings representing the Microgenes. The length of each Microgene is no more than 5. The input is terminated by EOF.

Output
One line for each case, the answer modulo 10007.

Sample Input
2 3
AT
TC
2 3
ATC
T
3 1000000
ATCG
TCGT
CTAG

Sample Output
1
11
5063
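
One way to approach the counting is a DP over an Aho-Corasick automaton, sketched below. It assumes that "more than one Microgene" means at least two occurrences in total, counted by position and possibly of the same Microgene; this reading is inferred from the first two samples and is not stated explicitly. The names `build_automaton` and `count_dna` are illustrative only.

```python
from collections import deque

MOD = 10007
ALPHABET = "ATCG"


def build_automaton(patterns):
    """Build an Aho-Corasick automaton with all transitions filled in.

    hits[s] is the number of patterns that end when state s is reached,
    including patterns that are proper suffixes of the matched prefix.
    """
    goto, fail, hits = [{}], [0], [0]
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                hits.append(0)
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        hits[state] += 1

    queue = deque()
    for ch in ALPHABET:
        if ch in goto[0]:
            queue.append(goto[0][ch])
        else:
            goto[0][ch] = 0
    while queue:
        state = queue.popleft()
        hits[state] += hits[fail[state]]
        for ch in ALPHABET:
            if ch in goto[state]:
                child = goto[state][ch]
                fail[child] = goto[fail[state]][ch]
                queue.append(child)
            else:
                goto[state][ch] = goto[fail[state]][ch]
    return goto, hits


def count_dna(patterns, length):
    """Count strings of the given length with at least two occurrences."""
    goto, hits = build_automaton(patterns)
    n = len(goto)
    # dp[state][c]: number of prefixes ending in state with c occurrences,
    # where c is capped at 2.
    dp = [[0] * 3 for _ in range(n)]
    dp[0][0] = 1
    for _ in range(length):
        nxt = [[0] * 3 for _ in range(n)]
        for state in range(n):
            for c in range(3):
                ways = dp[state][c]
                if ways == 0:
                    continue
                for ch in ALPHABET:
                    target = goto[state][ch]
                    c2 = min(2, c + hits[target])
                    nxt[target][c2] = (nxt[target][c2] + ways) % MOD
        dp = nxt
    return sum(row[2] for row in dp) % MOD


print(count_dna(["AT", "TC"], 3))
print(count_dna(["ATC", "T"], 3))
```

The per-step loop is linear in L, so for L near 1000000 a compiled language or matrix exponentiation over the (state, count) pairs would likely be needed; the sketch only aims to show the state design.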

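Description
As is well known, a human gene can be viewed as a sequence of base pairs made up of four nucleotides, abbreviated A, C, G and T. Consider the gene sequence "ACCAGGTT". It consists of 8 nucleotides: positions 1 and 4 are base "A", positions 2 and 3 are base "C", positions 5 and 6 are base "G", and positions 7 and 8 are base "T". Tom constructs the following 0/1 matrix:

1, 0, 0, 1, 0, 0, 0, 0
0, 1, 1, 0, 0, 0, 0, 0
0, 1, 1, 0, 0, 0, 0, 0
1, 0, 0, 1, 0, 0, 0, 0
0, 0, 0, 0, 1, 1, 0, 0
0, 0, 0, 0, 1, 1, 0, 0
0, 0, 0, 0, 0, 0, 1, 1
0, 0, 0, 0, 0, 0, 1, 1

If the base at position i is the same as the base at position j, the entry in row i, column j of the matrix is 1; otherwise it is 0. If gene sequences X and Y have the same length and the same 0/1 matrix, Tom considers X and Y to be similar gene sequences.
The task: given two gene sequences of length N, help Tom decide whether they are similar.

Input
There may be multiple test cases. The first line of each case contains a positive integer N (1 ≤ N ≤ 1000000); the second and third lines each contain a gene sequence of length N (consisting only of the characters A, C, G and T). The input ends with N = 0.

Output
For each test case print a single line: YES if the sequences are similar, NO otherwise.

Sample Input
2
AA
TG
6
ACCGTT
GAATCC
0

Sample Output
NO
YES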
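
The N×N matrix never needs to be built. Two sequences have the same matrix exactly when the position-by-position pairing of their letters is a consistent one-to-one mapping, which can be checked in a single pass. A minimal sketch (the function name `similar` is an assumption, not part of the problem):

```python
def similar(x, y):
    """Return True if x and y induce the same 0/1 equality matrix."""
    if len(x) != len(y):
        return False
    forward, backward = {}, {}
    for a, b in zip(x, y):
        # setdefault records the first pairing and returns the stored one,
        # so any later conflicting pairing is detected immediately.
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True


print("YES" if similar("AA", "TG") else "NO")
print("YES" if similar("ACCGTT", "GAATCC") else "NO")
```

This runs in O(N) time, which matters for N up to 1000000, whereas the explicit matrix would need N² entries.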

An approach to a hard DNA genetics problem: how can it be solved in C?

Problem Description
Every kind of living creature has its own DNA. The nucleotide bases from which DNA is built are A (adenine), C (cytosine), G (guanine), and T (thymine). If the DNA of two living creatures shares a common substring whose length is beyond a certain percentage of the whole length, we may consider whether the two creatures have the same ancestor. We can then temporarily put them into one species for our research, and we say the two living creatures are similar. Note that if A is similar to B and B is similar to C, but C is not similar to A, we still put A, B and C into one kind, because aberrance happens during evolution. Given some kinds of living creatures and their DNA, tell us how many kinds of living creatures we can separate them into.

Input
There are many cases. The first line of each case contains two numbers N and P. N is the number of kinds of living creatures. Two DNAs are similar if there exists a common substring whose length is beyond the given percentage of the length of either of the two DNAs, and P is that percentage. 1<=N<=100 and 1<=P<100 (P = 100 would mean two DNAs are similar if and only if they are the same, so P is guaranteed to be smaller than 100). The length of each DNA won't exceed 100.

Output
For each case, print how many kinds of living creatures we can separate.

Sample Input
3 10.0
AAA
AA
CCC

Sample Output
Case 1: 2
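
The grouping described above can be sketched with a longest-common-substring DP plus union-find. Two points are assumptions rather than facts from the statement: "beyond" is read as strictly greater, and the common substring must exceed P percent of the longer of the two DNAs (and therefore of both). The function names are illustrative.

```python
def longest_common_substring(a, b):
    """Length of the longest string that is a substring of both a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, 1):
            if ch_a == ch_b:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def count_kinds(dnas, percent):
    """Number of groups after merging every pair of similar DNAs."""
    parent = list(range(len(dnas)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(len(dnas)):
        for j in range(i + 1, len(dnas)):
            common = longest_common_substring(dnas[i], dnas[j])
            longer = max(len(dnas[i]), len(dnas[j]))
            # Assumption: strict comparison against the longer DNA.
            if common * 100 > percent * longer:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(dnas))})


print("Case 1:", count_kinds(["AAA", "AA", "CCC"], 10.0))
```

Union-find handles the transitive rule directly: A, B and C end up in one kind whenever A is similar to B and B is similar to C, even if A and C are not similar.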

Representing and computing with DNA sequences over the four symbols A, C, G and T: how can this be implemented effectively in C?

C: checking for palindromes gives a runtime error, can anyone help?

#include <stdio.h>
#include <string.h>

int main(void) {
    static char str[100000];  /* static keeps the large buffer off the stack */

    /* Read one word at a time until EOF or the sentinel "2013". */
    while (scanf("%99999s", str) == 1 && strcmp(str, "2013") != 0) {
        size_t len = strlen(str);
        int mark = 1;
        /* Compare characters from both ends toward the middle. */
        for (size_t i = 0; i < len / 2; i++) {
            if (str[i] != str[len - i - 1]) {
                mark = 0;
                break;
            }
        }
        printf(mark ? "YES\n" : "NO\n");
    }
    return 0;
}

DNA Translation: a DNA sequence problem

Description Deoxyribonucleic acid (DNA) is composed of a sequence of nucleotide bases paired together to form a double-stranded helix structure. Through a series of complex biochemical processes the nucleotide sequences in an organism's DNA are translated into the proteins it requires for life. The object of this problem is to write a computer program which accepts a DNA strand and reports the protein generated, if any, from the DNA strand. The nucleotide bases from which DNA is built are adenine, cytosine, guanine, and thymine (hereafter referred to as A, C, G, and T, respectively). These bases bond together in a chain to form half of a DNA strand. The other half of the DNA strand is a similar chain, but each nucleotide is replaced by its complementary base. The bases A and T are complementary, as are the bases C and G. These two "half-strands" of DNA are then bonded by the pairing of the complementary bases to form a strand of DNA. Typically a DNA strand is listed by simply writing down the bases which form the primary strand (the complementary strand can always be created by writing the complements of the bases in the primary strand). For example, the sequence TACTCGTAATTCACT represents a DNA strand whose complement would be ATGAGCATTAAGTGA. Note that A is always paired with T, and C is always paired with G. From a primary strand of DNA, a strand of ribonucleic acid (RNA) known as messenger RNA (mRNA for short) is produced in a process known as transcription. The transcribed mRNA is identical to the complementary DNA strand with the exception that thymine is replaced by a nucleotide known as uracil (hereafter referred to as U). For example, the mRNA strand for the DNA in the previous paragraph would be AUGAGCAUUAAGUGA. It is the sequence of bases in the mRNA which determines the protein that will be synthesized. The bases in the mRNA can be viewed as a collection of codons, each codon having exactly three bases. 
The codon AUG marks the start of a protein sequence, and any of the codons UAA, UAG, or UGA marks the end of the sequence. The one or more codons between the start and termination codons represent the sequence of amino acids to be synthesized to form a protein. For example, the mRNA codon AGC corresponds to the amino acid serine (Ser), AUU corresponds to isoleucine (Ile), and AAG corresponds to lysine (Lys). So, the protein formed from the example mRNA in the previous paragraph is, in its abbreviated form, Ser-Ile-Lys. The complete genetic code from which codons are translated into amino acids is shown in the table below (note that only the amino acid abbreviations are shown). It should also be noted that the sequence AUG, which has already been identified as the start sequence, can also correspond to the amino acid methionine (Met). So, the first AUG in a mRNA strand is the start sequence, but subsequent AUG codons are translated normally into the Met amino acid.

First base   Second base in codon   Third base
in codon     U    C    A    G       in codon
U            Phe  Ser  Tyr  Cys     U
             Phe  Ser  Tyr  Cys     C
             Leu  Ser  ---  ---     A
             Leu  Ser  ---  Trp     G
C            Leu  Pro  His  Arg     U
             Leu  Pro  His  Arg     C
             Leu  Pro  Gln  Arg     A
             Leu  Pro  Gln  Arg     G
A            Ile  Thr  Asn  Ser     U
             Ile  Thr  Asn  Ser     C
             Ile  Thr  Lys  Arg     A
             Met  Thr  Lys  Arg     G
G            Val  Ala  Asp  Gly     U
             Val  Ala  Asp  Gly     C
             Val  Ala  Glu  Gly     A
             Val  Ala  Glu  Gly     G

Input
The input for this program consists of strands of DNA sequences, one strand per line, from which the protein it generates, if any, should be determined and output. The given DNA strand may be either the primary or the complementary DNA strand, and it may appear in either forward or reverse order, and the start and termination sequences do not necessarily appear at the ends of the strand. For example, a given input DNA strand to form the protein Ser-Ile-Lys could be any of ATACTCGTAATTCACTCC, CCTCACTTAATGCTCATA, TATGAGCATTAAGTGAGG, or GGAGTGAATTACGAGTAT. The input will be terminated by a line containing a single asterisk character.

Output
You may assume the input to contain only valid, upper-case, DNA nucleotide base letters (A, C, G, and T). No input line will exceed 255 characters in length. There will be no blank lines or spaces in the input. Some sequences, though valid DNA strands, do not produce valid protein sequences; the string "*** No translatable DNA found ***" should be output when an input DNA strand does not translate into a valid protein.

Sample Input
ATACTCGTAATTCACTCC
CACCTGTACACAGAGGTAACTTAG
TTAATACGACATAATTAT
GCCTTGATATGGAGAACTCATTAGATA
AAGTGTATGTTGAATTATATAAAACGGGCATGA
ATGATGATGGCTTGA
*

Sample Output
Ser-Ile-Lys
Cys-Leu-His
Ser-Tyr
*** No translatable DNA found ***
Leu-Asn-Tyr-Ile-Lys-Arg-Ala
Met-Met-Ala
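
The two core steps, transcription and codon-by-codon translation, can be sketched as follows. This is only a building block under stated assumptions: it translates a single mRNA strand starting from its first AUG, and it deliberately leaves out the choice among the four possible orientations of an input strand, since the statement does not say which one wins when several translate. The names `transcribe` and `translate_mrna` are illustrative.

```python
BASES = "UCAG"
# Amino acids in table order: first base, then second base, then third base.
AMINO = (
    "Phe Phe Leu Leu Ser Ser Ser Ser Tyr Tyr --- --- Cys Cys --- Trp "
    "Leu Leu Leu Leu Pro Pro Pro Pro His His Gln Gln Arg Arg Arg Arg "
    "Ile Ile Ile Met Thr Thr Thr Thr Asn Asn Lys Lys Ser Ser Arg Arg "
    "Val Val Val Val Ala Ala Ala Ala Asp Asp Glu Glu Gly Gly Gly Gly"
).split()
CODONS = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(primary):
    """mRNA for a primary strand: the complement, with U in place of T."""
    return "".join(COMPLEMENT_TO_RNA[base] for base in primary)


def translate_mrna(mrna):
    """Protein coded after the first AUG, or None if there is none."""
    start = mrna.find("AUG")
    if start == -1:
        return None
    acids = []
    for pos in range(start + 3, len(mrna) - 2, 3):
        acid = CODONS[mrna[pos:pos + 3]]
        if acid == "---":  # termination codon
            return "-".join(acids) if acids else None
        acids.append(acid)
    return None  # ran off the end without a termination codon


print(translate_mrna(transcribe("TACTCGTAATTCACT")))
```

A full solution would apply `translate_mrna` to each candidate orientation of the input strand and print the "*** No translatable DNA found ***" message when every candidate returns None.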

A problem on inferring DNA evolution: how should the program be designed and implemented in C?

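Design a class DNASequence for storing and processing DNA sequences, with the following features and functionality:

3. Design a class DNASequence for storing and processing DNA sequences. The class has the following features and functionality:
- A translation table, defined as a class attribute named transcription_table, of type dict, which maps each of the DNA symbols A, T, G and C to its corresponding symbol: A to U, T to A, G to C, and C to G.
- A restriction enzyme table, defined as a class attribute named enz_dict, of type dict. A restriction enzyme is a protein that recognizes a specific DNA sequence and cuts within the recognition region. This exercise only concerns two restriction enzymes: 'EcoRI', which recognizes the sequence 'GAATTC', and 'EcoRV', which recognizes the sequence 'GATATC'.
- The constructor __init__(self, seqstring) takes a string seqstring representing a DNA sequence. DNASequence has an instance attribute seqstring that stores this string. All characters of the parameter seqstring must be converted to upper case before being stored in the instance attribute seqstring.
- An instance method transcription(self), which translates the DNA sequence stored in the instance attribute seqstring symbol by symbol into the sequence of corresponding symbols; for example, "ATG" is translated to "UAC".
- An instance method restriction(self, enz), where enz is the name of a restriction enzyme, of type str. The method counts the occurrences in the instance attribute seqstring of the DNA sequence recognized by the given restriction enzyme. If seqstring does not contain that sequence, it returns 0.
- Overload len, that is, define the special method __len__(self), which returns the length of the instance attribute seqstring.
- Note: this exercise does not check whether the characters of the __init__ parameter seqstring are valid; assume every character given is one of A, T, G and C.

Example session:
>>> virus = DNASequence('atggagagccttgttcttggtgtcaa')
>>> virus.seqstring
'ATGGAGAGCCTTGTTCTTGGTGTCAA'
>>> virus.transcription()
'UACCUCUCGGAACAAGAACCACAGUU'
>>> other_virus = DNASequence('atgatatcggagaggatatcggtgtcaa')
>>> other_virus.restriction('EcoRV')
2
>>> len(virus)
26

Code skeleton: copy this skeleton and save it in the file DNASequence.py. To test it, write your test program in a separate file. DNASequence.py may contain only the implementation of the DNASequence class and no other test code. When submitting, submit only DNASequence.py. If any other code interferes with the teacher's grading, you bear the resulting loss of marks.

class DNASequence:
    transcription_table = {}  # translation table
    enz_dict = {}  # restriction enzyme table

    def __init__(self, seqstring):
        # Write your code below

        # Do not modify the code below

    def __len__(self):
        # Write your code below

        # Do not modify the code below

    def restriction(self, enz):
        # Write your code below

        # Do not modify the code below

    def transcription(self):
        # Write your code below

        # Do not modify the code below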
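
One possible way to fill in the skeleton is sketched below. It follows the attribute and method names given in the exercise; returning 0 for an enzyme name that is not in enz_dict is an assumption, since the exercise does not cover that case.

```python
class DNASequence:
    # Translation table: DNA symbol -> corresponding symbol
    transcription_table = {"A": "U", "T": "A", "G": "C", "C": "G"}
    # Restriction enzyme name -> recognized DNA sequence
    enz_dict = {"EcoRI": "GAATTC", "EcoRV": "GATATC"}

    def __init__(self, seqstring):
        # Store the sequence in upper case, as the exercise requires.
        self.seqstring = seqstring.upper()

    def __len__(self):
        return len(self.seqstring)

    def restriction(self, enz):
        # Assumption: an unknown enzyme name yields 0 instead of an error.
        site = self.enz_dict.get(enz)
        if site is None:
            return 0
        return self.seqstring.count(site)

    def transcription(self):
        # Translate symbol by symbol through the class-level table.
        return "".join(self.transcription_table[ch] for ch in self.seqstring)


virus = DNASequence("atggagagccttgttcttggtgtcaa")
print(virus.transcription())
print(len(virus))
```

`str.count` counts non-overlapping occurrences, which is enough here because neither GAATTC nor GATATC can overlap a copy of itself.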

Description
A biologist experimenting with DNA modification of bacteria has found a way to make bacterial colonies sensitive to the surrounding population density. By changing the DNA, he is able to "program" the bacteria to respond to the varying densities in their immediate neighborhood. The culture dish is a square, divided into 400 smaller squares (20x20). Population in each small square is measured on a four point scale (from 0 to 3). The DNA information is represented as an array D, indexed from 0 to 15, of integer values and is interpreted as follows: In any given culture dish square, let K be the sum of that square's density and the densities of the four squares immediately to the left, right, above and below that square (squares outside the dish are considered to have density 0). Then, by the next day, that dish square's density will change by D[K] (which may be a positive, negative, or zero value). The total density cannot, however, exceed 3 nor drop below 0. Now, clearly, some DNA programs cause all the bacteria to die off (e.g., [-3, -3, ..., -3]). Others result in immediate population explosions (e.g., [3, 3, 3, ..., 3]), and others are just plain boring (e.g., [0, 0, ..., 0]). The biologist is interested in how some of the less obvious DNA programs might behave. Write a program to simulate the culture growth, reading in the number of days to be simulated, the DNA rules, and the initial population densities of the dish.

Input
Input to this program consists of three parts:
1. The first line will contain a single integer denoting the number of days to be simulated.
2. The second line will contain the DNA rule D as 16 integer values, ordered from D[0] to D[15], separated from one another by one or more blanks. Each integer will be in the range -3...3, inclusive.
3. The remaining twenty lines of input will describe the initial population density in the culture dish. Each line describes one row of squares in the culture dish, and will contain 20 integers in the range 0...3, separated from one another by 1 or more blanks.

Output
The program will produce exactly 20 lines of output, describing the population densities in the culture dish at the end of the simulation. Each line represents a row of squares in the culture dish, and will consist of 20 characters, plus the usual end-of-line terminator. Each character will represent the population density at a single dish square, as follows: No other characters may appear in the output.

Sample Input
2
0 1 1 1 2 1 0 -1 -1 -1 -2 -2 -3 -3 -3 -3
3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Sample Output
##!.................
#!..................
!...................
....................
....................
....................
....................
.........!..........
........!#!.........
.......!#X#!........
........!#!.........
.........!..........
....................
....................
....................
....................
....................
....................
....................
....................
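
The daily update rule can be sketched as a simple grid simulation. The character mapping used for printing (0 as '.', 1 as '!', 2 as 'X', 3 as '#') does not appear in the text above; it is an assumption inferred from the sample output. The names `step` and `simulate` are illustrative.

```python
SYMBOLS = ".!X#"  # assumed mapping for densities 0, 1, 2, 3


def step(grid, rules):
    """Advance the dish by one day; every square updates from the old grid."""
    size = len(grid)
    new_grid = [[0] * size for _ in range(size)]
    for r in range(size):
        for c in range(size):
            k = grid[r][c]
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                # Squares outside the dish count as density 0.
                if 0 <= rr < size and 0 <= cc < size:
                    k += grid[rr][cc]
            # Apply D[K], then clamp the density to the range 0..3.
            new_grid[r][c] = max(0, min(3, grid[r][c] + rules[k]))
    return new_grid


def simulate(days, rules, grid):
    """Run the simulation and return the dish as a list of strings."""
    for _ in range(days):
        grid = step(grid, rules)
    return ["".join(SYMBOLS[d] for d in row) for row in grid]


dish = [[0] * 20 for _ in range(20)]
dish[0][0] = 3
dish[9][9] = 3
dna = [0, 1, 1, 1, 2, 1, 0, -1, -1, -1, -2, -2, -3, -3, -3, -3]
print("\n".join(simulate(2, dna, dish)))
```

Writing into a fresh grid each day keeps the update simultaneous, so a square never sees a neighbor's new density during the same day.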

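Python: DNA sorting by inversion count.

[Problem description]
For a given sequence {a[1], a[2], ..., a[n]}, the inversion count of element a[i] is defined as inv(a[i]) = |{a[k] | a[i] > a[k], i < k <= n}|. The inversion count of sequence A is defined as inv(A) = inv(a[1]) + inv(a[2]) + ... + inv(a[n]).
In effect, the inversion count of A measures how sorted A already is. The smaller the inversion count, the more sorted A is. When A is fully sorted, its inversion count is 0.
In molecular computing research on DNA sequences, bioinformaticians need to sort several DNA strings of equal length in ascending order of inversion count.
Write a program that, given DNA strings of equal length, sorts them in ascending order of inversion count.
Characters in a DNA string are compared in character order. The data is read from the file "input.txt" and the result is written to "output.txt".

[Input format]
The first line contains two integers: the DNA length L and the number of DNA strings n.
The next n lines each contain one DNA string.
The input ends with two zeros.

[Output format]
Print one DNA string per line, in ascending order of inversion count.

[Sample input]
From input.txt:
10 6
AACATGAAGG
TTTTGGCCAA
TTTGGCCAAA
GATCAGATTT
CCCGGGGGGA
ATCGATGCAT
0 0

[Sample output]
To output.txt:
CCCGGGGGGA
AACATGAAGG
GATCAGATTT
ATCGATGCAT
TTTTGGCCAA
TTTGGCCAAA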
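
The definition translates almost directly into code. The sketch below counts inversions with a quadratic scan and sorts with the count as the key; file handling for input.txt and output.txt is left out, and keeping the input order for strings with equal counts is an assumption (the statement does not say how ties are broken). The function names are illustrative.

```python
def inversions(dna):
    """Number of pairs (i, k) with i < k and dna[i] > dna[k]."""
    return sum(
        1
        for i in range(len(dna))
        for k in range(i + 1, len(dna))
        if dna[i] > dna[k]
    )


def sort_by_inversions(dnas):
    # sorted() is stable, so strings with equal counts keep their input order.
    return sorted(dnas, key=inversions)


sample = [
    "AACATGAAGG",
    "TTTTGGCCAA",
    "TTTGGCCAAA",
    "GATCAGATTT",
    "CCCGGGGGGA",
    "ATCGATGCAT",
]
for dna in sort_by_inversions(sample):
    print(dna)
```

For long strings the quadratic scan could be replaced by counting, for each character, how many smaller letters follow it, since the alphabet has only four symbols.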

Computer simulation of the DNA evolution process

A problem on processing and modifying DNA sequence strings: how can it be solved with a C program?

Problem Description
Biologists have finally invented techniques for repairing DNA that contains segments causing various inherited diseases. For the sake of simplicity, a DNA is represented as a string containing the characters 'A', 'G', 'C' and 'T'. The repairing technique simply changes some characters to eliminate all segments causing diseases. For example, we can repair the DNA "AAGCAG" to "AGGCAC", eliminating the disease-causing segments "AAG", "AGC" and "CAG" by changing two characters. Note that the repaired DNA can still contain only the characters 'A', 'G', 'C' and 'T'.
You are to help the biologists repair a DNA by changing the least number of characters.

Input
The input consists of multiple test cases. Each test case starts with a line containing one integer N (1 ≤ N ≤ 50), the number of DNA segments causing inherited diseases.
The following N lines give N non-empty strings of length not greater than 20 containing only characters in "AGCT", which are the DNA segments causing inherited diseases.
The last line of the test case is a non-empty string of length not greater than 1000 containing only characters in "AGCT", which is the DNA to be repaired.
The last test case is followed by a line containing a single zero.

Output
For each test case, print a line containing the test case number (beginning with 1) followed by the number of characters which need to be changed. If it is impossible to repair the given DNA, print -1.

Sample Input
2
AAA
AAG
AAAG
2
A
TG
TGAATG
4
A
G
C
T
AGT
0

Sample Output
Case 1: 1
Case 2: 4
Case 3: -1
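
A common way to attack this kind of problem is a DP over an Aho-Corasick automaton built from the disease segments: walk the DNA one position at a time, try each of the four letters, and never enter a state that completes a segment. The sketch below is one such approach, not necessarily the intended solution; the names `build_automaton` and `min_repairs` are illustrative.

```python
from collections import deque

ALPHABET = "AGCT"
INF = float("inf")


def build_automaton(segments):
    """Aho-Corasick automaton; bad[s] is True if some segment ends at s."""
    goto, fail, bad = [{}], [0], [False]
    for segment in segments:
        state = 0
        for ch in segment:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                bad.append(False)
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        bad[state] = True

    queue = deque()
    for ch in ALPHABET:
        if ch in goto[0]:
            queue.append(goto[0][ch])
        else:
            goto[0][ch] = 0
    while queue:
        state = queue.popleft()
        # A state is also bad if a proper suffix of it is a segment.
        bad[state] = bad[state] or bad[fail[state]]
        for ch in ALPHABET:
            if ch in goto[state]:
                child = goto[state][ch]
                fail[child] = goto[fail[state]][ch]
                queue.append(child)
            else:
                goto[state][ch] = goto[fail[state]][ch]
    return goto, bad


def min_repairs(segments, dna):
    """Fewest character changes that remove every segment, or -1."""
    goto, bad = build_automaton(segments)
    best = [INF] * len(goto)  # best[s]: min changes to reach state s
    best[0] = 0
    for actual in dna:
        cur = [INF] * len(goto)
        for state, cost in enumerate(best):
            if cost == INF:
                continue
            for ch in ALPHABET:
                target = goto[state][ch]
                if bad[target]:
                    continue
                new_cost = cost + (ch != actual)
                if new_cost < cur[target]:
                    cur[target] = new_cost
        best = cur
    answer = min(best)
    return -1 if answer == INF else answer


print("Case 1:", min_repairs(["AAA", "AAG"], "AAAG"))
```

With at most 50 segments of length 20 the automaton has about 1000 states, so the DP touches roughly 1000 positions × 1000 states × 4 letters in the worst case.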
