Programming Assignment 3
Realization of a Spell Checker Using AVL Trees
In this assignment, you prepare height-balanced trees in order to store a dictionary of words. Assume that we are dealing with a language called Funglish for which a word list is provided. For simplicity, we assume that Funglish has only two- to eight-letter words. The word-list file contains these words in a random order and with repetitions. Funglish is case-insensitive, so you do not need to worry about distinguishing between lower and upper cases. Work consistently with upper-case letters.
Creating the AVL trees
Read the word list from the supplied file, and for each i in the range [2,8], store all i-letter words in an AVL tree Ti. Your dictionary is stored as a collection of the seven trees T2, T3, ..., T8. You may use the following data types to store the dictionary.
typedef struct _treenode { char *word; struct _treenode *left; struct _treenode *right; struct _treenode *parent; int balancefactor; } AVLtreenode; typedef AVLtreenode *AVLtree; typedef AVLtree *dictionary;This is a kind of hashed AVL forest, where an initial hashing is done with respect to the lengths of the words. Initially, create seven empty trees. Subsequently, as you read the word-list file, insert each word in the appropriate tree based upon the length of the word. Use a single function for insertion in an AVL tree.
This is a static dictionary, and sorted arrays could have served our purpose. But this exercise is meant for playing with height-balanced trees. In this exercise, you never delete any word from the dictionary, so there is no need to write the deletion function for AVL trees.
After the entire word-list is read, compute and report the heights of all the seven trees.
Using the following command, you can sort the word-list file, remove duplicates from it, and store the list in a file wordlist-sorted-nodup.txt.
$ sort wordlist.txt | uniq > wordlist-sorted-nodup.txtBuild the seven trees using this processed word-list file, and report the heights of the seven trees. It is very interesting to note that if the items input to an AVL tree appear in the sorted order, the result is a perfectly balanced binary search tree. Try to prove this assertion as a theoretical exercise (not for submission).
Spell Checking
Generate a random (purportedly Funglish) word with the following constraints.
- The length of the word is randomly chosen between two and eight (both inclusive).
- A letter in the word is chosen according to the following probability distribution. The relative frequencies of the different letters (in a total of 100) are provided in the table below.
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 9 2 2 6 11 4 3 2 9 1 1 4 2 4 8 2 1 6 5 6 4 2 2 1 2 1 - Two consecutive vowels or two consecutive consonants occur relatively infrequently.
Search the word generated in your dictionary. If the word is found, do nothing. Otherwise, generate spell-correcting suggestions. There are four groups of words that are suggested.
- Group 1: Delete any one letter from the given word.
- Group 2: Add one letter at any position of the given word.
- Group 3: Change any single letter of the given word.
- Group 4: Swap any two letters of the given word.
Report all the suggested words that are present in the dictionary.
Destroy the Dictionary
After the spell-checking work is done, destroy the dictionary. This means that any dynamic memory allocated to store the dictionary is to be freed. This includes the memory associated with the char array in each node (field word in data type AVLtreenode), the memory associated with each node (allocated to the left or right pointer of some node), and also the memory associated with the array of seven tree headers (data type dictionary).
Sample Output
Present your output as in the following format.Height of D[2] = 9 Height of D[3] = 11 ... Height of D[8] = 22 Searching for word "SE" : Found Searching for word "ETEDOGE" : Not found Spell check suggestions: Group 1: Group 2: FETEDOGE Group 3: ESEDOGE ETEDAGE ETEDOCE ETEDOLE ETEDORE ETEDOSE ETEDOTE Group 4: EDETOGE ETODEGE ETEDEGO Searching for word "RUS" : Found Searching for word "OCUCU" : Not found Spell check suggestions: Group 1: CUCU Group 2: Group 3: Group 4: Searching for word "TWESOM" : Not found Spell check suggestions: Group 1: WESOM TESOM Group 2: Group 3: Group 4: Searching for word "RILISA" : Not found Spell check suggestions: Group 1: ILISA Group 2: Group 3: GILISA PILISA RIDISA RIFISA RIRISA RISISA RITISA RILINA RILIQA RILISF Group 4: SILIRA Searching for word "URARE" : Not found Spell check suggestions: Group 1: RARE Group 2: BURARE HURARE TURARE URARER URARES Group 3: ERARE IRARE UFARE UJARE UMARE UNARE UTARE URASE URATE URARA URARU Group 4: Searching for word "ELDIRIT" : Not found Spell check suggestions: Group 1: Group 2: Group 3: Group 4:Submission site