Separate from the DNA-RNA-Protein process are two separate processes for cell duplication called mitosis and meiosis. Both processes involve a cell replicating its DNA, and it is within this replication that mutations can occur. Perhaps there was a copying mistake, or a random x-ray damaged part of the DNA; in any case, the DNA code we begin with can undergo small alterations. Each mutation during this replication process could possibly have a large impact of the earlier transcription and translation process, such that we could have a new stop codon within a gene, or we could place a different amino acid into our protein chain. While some changes may be very harmful and lead to genetic diseases, others could provide an unexpected benefit and new functionality. These underlying mutations introduce variety into the gene pool for a population, and allow natural selection and evolution to occur.
While the language of DNA was not understood until 1953 by James Watson and Francis Crick with the help of data from Rosalind Franklin, similar properties of mutation were noticed nearly 100 years earlier by Charles Lutwidge Dodgson (aka Lewis Carroll) , who is most famous for his novel "Alice in Wonderland." In an article for Vanity Fair, Carroll created a puzzle called Doublets, which explored the effects of mutation on English words. His rules description is as follows:
The rules of the Puzzle are simple enough. Two words are proposed, of the same length; and the Puzzle consists in linking these together by interposing other words, each of which shall differ from the next word in one letter only. That is to say, one letter may be changed in one of the given words, then one letter in the word so obtained, and so on, till we arrive at the other given word. The letters must not be interchanged among themselves, but each must keep to its own place. As an example, the word 'head' may be changed into 'tail' by interposing the words 'heal, teal, tell, tall'. I call the given words 'a Doublet', the interposed words 'Links', and the entire series 'a Chain', of which I here append an example:
H E A D h e a l t e a l t e l l t a l l T A I LIt is, perhaps, needless to state that it is de rigueur that the links should be English words, such as might be used in good society.
(More details on the connection between Lewis Carroll and genetics can be found in an article by David B. Searls entitled From Jabberwocky to Genome: Lewis Carroll and Computational Biology).
Our task for this lab is to create a Python program that lets a user play the Doublets game and enforces all the rules given above by Lewis Carroll.
Write your answers in a Google Doc called Lab 9
.
To demonstrate the eventual behavior of the program, here is a sample run of the Doublets game.
What is the starting word? CAT What is the ending word? DOG Start = CAT Current = CAT End = DOG Which character do you want to change? (the first character is 1) 3 What is your new character? B Current = CAB End = DOG Which character do you want to change? (the first character is 1) 2 What is your new character? O Current = COB End = DOG Which character do you want to change? (the first character is 1) 3 What is your new character? G Current = COG End = DOG Which character do you want to change? (the first character is 1) 1 What is your new character? D Solution path found in 4 steps. CAT -> CAB -> COB -> COG -> DOG
A second run of the program is shown below, where the user made many mistakes.
What is the starting word? log What is the ending word? worm The lengths are not equal. Please try again. What is the ending word? sfu sfu is not a word. Please try again. What is the ending word? bug Start = LOG Current = LOG End = BUG Which character do you want to change? (the first character is 1) 0 0 is out of range. Please try again. Which character do you want to change? (the first character is 1) 4 4 is out of range. Please try again. Which character do you want to change? (the first character is 1) 1 What is your new character? 4 4og is not a valid word Current = LOG End = BUG Which character do you want to change? (the first character is 1) 1 What is your new character? b Current = BOG End = BUG Which character do you want to change? (the first character is 1) x x is not a number. Please try again. Which character do you want to change? (the first character is 1) 2 What is your new character? uu uu is more than one character. Please try again. Which character do you want to change? (the first character is 1) 2 What is your new character? u Solution path found in 2 steps. LOG -> BOG -> BUG
Add the pseudocode to your Lab 9 Google Doc
.
english-words.txt
and the python module
dictionary.py
files to your working directory.
The dictionary module contains one function
valid_word(word, file)
, which will return True if the word is found in the file and
False otherwise.
Now take your pseduocode description and begin to implement your program in Python.
Your program should be named doublets.py
. Make sure to import the dictionary module
before trying to use the valid_word function.
Write your code in small sections and test incrementally.
The approach we will take is to start with your basic functions and build your program from the bottom up. Write each function individually and then test to make sure it will provide the functionality you need. In this way you will gradually build your whole Python program and know that each piece will work as you need.
Once your program is working, try out some of the transformations below
doublets.py
program lets the user chose what character to change at each step,
and then verifies that each new word is a valid word. Can you think of a way to always find the
minimum number of steps to transform one word to another? How will you know if a path
between two words can be found?
You must hand in: