Programming Assignment 3

Realization of a Spell Checker Using AVL Trees

In this assignment, you prepare height-balanced trees in order to store a dictionary of words. Assume that we are dealing with a language called Funglish for which a word list is provided. For simplicity, we assume that Funglish has only two- to eight-letter words. The word-list file contains these words in a random order and with repetitions. Funglish is case-insensitive, so you do not need to worry about distinguishing between lower and upper cases. Work consistently with upper-case letters.

Creating the AVL trees

Read the word list from the supplied file, and for each i in the range [2,8], store all i-letter words in an AVL tree Ti. Your dictionary is stored as a collection of the seven trees T2, T3, ..., T8. You may use the following data types to store the dictionary.

   typedef struct _treenode {
      char *word;
      struct _treenode *left;
      struct _treenode *right;
      struct _treenode *parent;
      int balancefactor;
   } AVLtreenode;

   typedef AVLtreenode *AVLtree;

   typedef AVLtree *dictionary;

This is a kind of hashed AVL forest, where an initial hashing is done with respect to the lengths of the words. Initially, create seven empty trees. Subsequently, as you read the word-list file, insert each word in the appropriate tree based upon the length of the word. Use a single function for insertion in an AVL tree.

This is a static dictionary, and sorted arrays could have served our purpose. But this exercise is meant for playing with height-balanced trees. In this exercise, you never delete any word from the dictionary, so there is no need to write the deletion function for AVL trees.

After the entire word-list is read, compute and report the heights of all the seven trees.

Using the following command, you can sort the word-list file, remove duplicates from it, and store the list in a file wordlist-sorted-nodup.txt.

   $ sort wordlist.txt | uniq > wordlist-sorted-nodup.txt

Build the seven trees using this processed word-list file, and report the heights of the seven trees. It is very interesting to note that if the items input to an AVL tree appear in the sorted order, the result is a perfectly balanced binary search tree. Try to prove this assertion as a theoretical exercise (not for submission).

How to do File I/O in C

String functions in C

Spell Checking

Generate a random (purportedly Funglish) word with the following constraints.

Search the word generated in your dictionary. If the word is found, do nothing. Otherwise, generate spell-correcting suggestions. There are four groups of words that are suggested.

Report all the suggested words that are present in the dictionary.

Destroy the Dictionary

After the spell-checking work is done, destroy the dictionary. This means that any dynamic memory allocated to store the dictionary is to be freed. This includes the memory associated with the char array in each node (field word in data type AVLtreenode), the memory associated with each node (allocated to the left or right pointer of some node), and also the memory associated with the array of seven tree headers (data type dictionary).

Sample Output

Present your output as in the following format.
   Height of D[2] = 9
   Height of D[3] = 11
   ...
   Height of D[8] = 22

   Searching for word "SE"       : Found

   Searching for word "ETEDOGE"  : Not found
   Spell check suggestions:
           Group 1: 
           Group 2: FETEDOGE 
           Group 3: ESEDOGE ETEDAGE ETEDOCE ETEDOLE ETEDORE ETEDOSE ETEDOTE 
           Group 4: EDETOGE ETODEGE ETEDEGO 

   Searching for word "RUS"      : Found

   Searching for word "OCUCU"    : Not found
   Spell check suggestions:
           Group 1: CUCU 
           Group 2: 
           Group 3: 
           Group 4: 

   Searching for word "TWESOM"   : Not found
   Spell check suggestions:
           Group 1: WESOM TESOM 
           Group 2: 
           Group 3: 
           Group 4: 

   Searching for word "RILISA"   : Not found
   Spell check suggestions:
           Group 1: ILISA 
           Group 2: 
           Group 3: GILISA PILISA RIDISA RIFISA RIRISA RISISA RITISA RILINA RILIQA RILISF 
           Group 4: SILIRA 

   Searching for word "URARE"    : Not found
   Spell check suggestions:
           Group 1: RARE 
           Group 2: BURARE HURARE TURARE URARER URARES 
           Group 3: ERARE IRARE UFARE UJARE UMARE UNARE UTARE URASE URATE URARA URARU 
           Group 4: 

   Searching for word "ELDIRIT"  : Not found
   Spell check suggestions:
           Group 1: 
           Group 2: 
           Group 3: 
           Group 4: 

Submission site


Back | Home