python - Finding the k-mer that results in the minimum Hamming distance in a list of DNA strings
I am writing a program that, given the inputs k (some number) and dna (a list of DNA fragments), should output the k-mer of size k that has the minimum Hamming distance to the array of strings. I have three functions:

1. One calculates the Hamming distance between a k-mer and the different windows of a DNA fragment, and returns the Hamming distance of the window with the lowest score.
2. One generates all possible k-mers of size k.
3. One iterates through all of the windows of size k and every possible k-mer.

Unfortunately, the program is giving me the output aaa, which is incorrect. I know the logic error is not in combination(k) nor in hammingdistance, since I've used them before with the correct result.

    import itertools

    def combination(k):
        # Generate every possible k-mer over the four bases.
        bases = ['a', 't', 'g', 'c']
        combo = [''.join(p) for p in itertools.product(bases, repeat=k)]
        return combo

    def hammingdistance(pattern, seq):
        # Count the positions at which the two strings differ.
        if pattern == seq:
            return 0
        else:
            dist = 0
            for i in range(len(seq)):
                if pattern[i] != seq[i]:
                    dist += 1
            return dist

    def median_string(k, dna):
        k_mers = combination(k)
        distance = 0
        temp = 1000000000000000000
        for string in dna:
            hamming = 1000000000000000000
            c = 0
            for k_mer in k_mers:
                for subset in string[c: len(string) - k]:
                    if hamming > hammingdistance(k_mer, string[c : c+k]):
                        hamming = hammingdistance(k_mer, string[c : c+k])
                    c += 1
                distance += hamming
                if distance < temp:
                    temp = distance
                    best_pattern = k_mer
                distance = 0
        return best_pattern

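It turned out to be an indentation error in the last conditional. The check if distance < temp has to run once per k-mer, after the distances to all of the DNA strings have been added up, so the loop over k-mers is now the outer loop: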

    def median_string(k, dna):
        k_mers = combination(k)
        distance = 0
        temp = 1000000000000000000
        for k_mer in k_mers:
            for string in dna:
                # Find the closest window of this string to the current k-mer.
                hamming = 1000000000000000000
                c = 0
                for subset in string[c: len(string) - k]:
                    if hamming > hammingdistance(k_mer, string[c : c+k]):
                        hamming = hammingdistance(k_mer, string[c : c+k])
                    c += 1
                distance += hamming
            # Compare the summed distance once per k-mer, then reset it.
            if distance < temp:
                temp = distance
                best_pattern = k_mer
            distance = 0
        return best_pattern

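For comparison, here is a minimal sketch of the same search written with min() and sum(), so the per-k-mer comparison cannot slip to the wrong indentation level. The names hamming, min_window_distance and median_string_sketch are hypothetical and not from the original post. The sketch assumes every DNA string is at least k characters long and uses the same lowercase alphabet as above. It scans windows with range(len(text) - k + 1), which includes the final window; the bound len(string) - k in the loop above appears to stop one window short.

```python
import itertools


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def min_window_distance(pattern, text):
    """Smallest Hamming distance between pattern and any window of text."""
    k = len(pattern)
    # range(len(text) - k + 1) includes the final window, text[-k:].
    return min(hamming(pattern, text[i:i + k]) for i in range(len(text) - k + 1))


def median_string_sketch(k, dna, bases="atgc"):
    """Return a k-mer minimising the summed window distance over all strings."""
    best_pattern, best_total = None, float("inf")
    for letters in itertools.product(bases, repeat=k):
        k_mer = "".join(letters)
        total = sum(min_window_distance(k_mer, text) for text in dna)
        if total < best_total:  # compared once per k-mer, after summing
            best_pattern, best_total = k_mer, total
    return best_pattern


# "ttt" is the only 3-mer that occurs in all three strings.
print(median_string_sketch(3, ["aaattt", "cgattt", "tttgca"]))
```

If several k-mers tie for the smallest total, this sketch returns the first one in itertools.product order, because the comparison is a strict <.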