CSC277 - Project 3
Strings and Motifs
Assigned Feb 10th 11 a.m.
Due Feb 19th 9:45 a.m.
Part 1 - String Matching Algorithms
Using the Knuth-Morris-Pratt Exact String Matching algorithm, what is the prefix array
for the pattern GARGANTUAN
? Show the steps used to find this pattern in
the string PANGALACTICGARGLEGARGANTUANBLASTER
.
Part 2 - Brute Force Median String Algorithm
Write a program (in either Java or Python) to implement the BruteForceMedianSearch algorithm on
page 112 of the textbook. You must also implement the TotalDistance function. You need to
augment this algorithm to also save the starting positions of the found motif. Your output should
be in the form of a profile matrix and the resulting consensus string.
Part 3 - Greedy Motif Algorithm
Write a program (in either Java or Python) to implement the GreedyMotifSearch algorithm on
page 136 of the textbook. Your output should be in the form of a profile matrix and the
resulting consensus string.
Testing
Test you above two algorithm on the 15 sequences found in the
sample DNA file provided, searching for a
motif of length 8, and report your output.
Was there a large difference in the consensus string found?
Turn in your code, profile and consensus output, and writeup for part 1 in your csc277
directory on the cs.centenary.edu server.