conda env list
Hopefully you only see the "root" environment. Then type
source activate root
to get the mappings correct for your programs. Now type
python
and you should get the Anaconda build of Python 3.X. Here is what I see:
Python 3.6.1 |Anaconda 4.4.0 (x86_64)| (default, May 11 2017, 13:04:09)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
Now in the python console, type
import nltk
nltk.download()
Use this GUI to find the WordNet corpus and download it.
Next, exit Python and return to the terminal. Time to install gensim. The easiest way I found was to type
easy_install -U gensim
conda env list
Hopefully you only see the "root" environment. Then type
activate root
to get the mappings correct for your programs. Now type
python
and you should get the Anaconda build of Python 3.X. Here is what I see:
Python 3.6.1 |Anaconda 4.4.0 (x86_64)| (default, May 11 2017, 13:04:09)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
Now in the python console, type
import nltk
nltk.download()
Use this GUI to find the WordNet corpus and download it.
Next, exit Python and return to the terminal. Time to install gensim. The easiest way I found for Windows was to type
pip install -U gensim
If this doesn't work, try
conda install -c anaconda gensim
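Before moving on, it can help to confirm that both packages are visible to the interpreter you just activated. Here is a minimal sketch that uses only the standard library. The helper name `check_installed` is made up for this illustration and is not part of NLTK or gensim.

```python
import importlib.util


def check_installed(names):
    """Map each top-level module name to True if the active interpreter can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Report on the two packages this setup is supposed to provide.
    for name, found in check_installed(["nltk", "gensim"]).items():
        print(name, "found" if found else "MISSING")
```

If either package shows up as MISSING, the most likely cause is that the wrong environment is active, so rerun the activate step and try again.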
Record the number of unique tokens before and after stemming.
Which document shrank by the largest percentage?
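One way to organize the bookkeeping for the two stemming questions is sketched below. It assumes each document is already a list of tokens and that the stemmer is any function from a word to its stem. The names `vocabulary_shrinkage`, `largest_shrink`, and `toy_stem` are hypothetical, and `toy_stem` is only a stand-in so the sketch runs without extra packages.

```python
def vocabulary_shrinkage(tokens, stem):
    """Return (unique_before, unique_after, percent_shrink) for one document."""
    before = len(set(tokens))
    after = len({stem(token) for token in tokens})
    percent = 100.0 * (before - after) / before if before else 0.0
    return before, after, percent


def toy_stem(word):
    # Hypothetical stand-in for a real stemmer: drops one trailing "s".
    return word[:-1] if len(word) > 3 and word.endswith("s") else word


def largest_shrink(documents, stem):
    """documents maps a name to its token list; return the name that shrank the most."""
    return max(documents, key=lambda name: vocabulary_shrinkage(documents[name], stem)[2])


if __name__ == "__main__":
    sample = ["cats", "cat", "dogs", "dog"]
    print(vocabulary_shrinkage(sample, toy_stem))
```

To use a real stemmer, pass something like `nltk.stem.PorterStemmer().stem` in place of `toy_stem`. Which stemmer the exercise intends is not stated here, so treat that choice as an assumption.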
Which two words in the above list are the most similar according to WordNet?
Which word has the largest hypernym closure (in other words, which word is the most specific)?
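The two WordNet questions can be approached with a sketch like the following. It makes two assumptions that may not match what the exercise intends: each word is represented by its first synset only, and similarity is measured with `path_similarity`. The function names are hypothetical. The NLTK import is placed inside the functions so that the pair-finding helper can be tried with any scoring function.

```python
from itertools import combinations


def most_similar_pair(words, similarity):
    """Return the pair of words with the highest similarity score."""
    return max(combinations(words, 2), key=lambda pair: similarity(*pair))


def wordnet_similarity(a, b):
    # Assumption: compare only the first synset of each word.
    # Raises IndexError if a word has no synsets in WordNet.
    from nltk.corpus import wordnet as wn  # needs the WordNet corpus downloaded earlier
    score = wn.synsets(a)[0].path_similarity(wn.synsets(b)[0])
    return score if score is not None else 0.0


def hypernym_closure_size(word):
    # Assumption: again uses the first synset only.
    from nltk.corpus import wordnet as wn
    first = wn.synsets(word)[0]
    # closure() follows hypernym links repeatedly until the root is reached.
    return len(list(first.closure(lambda s: s.hypernyms())))
```

With the word list from the exercise stored in a variable such as `words`, `most_similar_pair(words, wordnet_similarity)` gives the closest pair and `max(words, key=hypernym_closure_size)` gives the word with the largest closure.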
Find four main characters, each from a different book in our corpus. What 10 words are most similar to each of these according to the Word2Vec calculations? Research any words or characters unfamiliar to you and discuss why they might be similar.
Test the "man is to king as woman is to ?" analogy using the books dataset.
Analyze two other analogies that test two different relationships commonly found between words in analogies.
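To see what an analogy query is doing, here is a simplified pure-Python sketch of the vector arithmetic. The two-dimensional vectors are invented toy values, not output from any trained model, and the function names are hypothetical. Real Word2Vec vectors have many more dimensions, but the idea is the same: form `b - a + c` and look for the nearest remaining word by cosine similarity.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def analogy(vectors, a, b, c):
    """Solve 'a is to b as c is to ?' by finding the word nearest to b - a + c."""
    target = [vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    # The three query words are excluded so the answer is a new word.
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Invented toy vectors for illustration only.
toy_vectors = {
    "man": [1.0, 0.0],
    "woman": [1.0, 1.0],
    "king": [3.0, 0.0],
    "queen": [3.0, 1.0],
    "apple": [0.0, 1.0],
}
print(analogy(toy_vectors, "man", "king", "woman"))  # prints "queen"
```

With a trained gensim Word2Vec model (called `model` here as an assumption), the comparable query is `model.wv.most_similar(positive=["king", "woman"], negative=["man"])`. On a small corpus of books, do not be surprised if the top answer is not "queen".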
For each topic, use this model to find the document that is most relevant. Discuss whether this matching is sensible.
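Once you have a weight for every topic in every document, the matching itself is a small lookup. The sketch below assumes those weights are already collected into a dictionary from document name to a list of topic weights. How you get them depends on the model in use, which is not shown in this section. The function name and the sample numbers are made up for illustration.

```python
def best_document_per_topic(doc_topic_weights):
    """Return a list whose i-th entry is the document with the largest weight on topic i.

    doc_topic_weights maps a document name to a list of topic weights,
    with the same number of topics for every document.
    """
    names = list(doc_topic_weights)
    num_topics = len(doc_topic_weights[names[0]])
    return [
        max(names, key=lambda name: doc_topic_weights[name][topic])
        for topic in range(num_topics)
    ]


if __name__ == "__main__":
    # Made-up weights for three hypothetical documents and three topics.
    weights = {
        "doc_a": [0.7, 0.2, 0.1],
        "doc_b": [0.1, 0.8, 0.1],
        "doc_c": [0.2, 0.0, 0.8],
    }
    print(best_document_per_topic(weights))
```

Looking at the weight of the runner-up document for each topic is a quick way to judge how decisive each match is.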