COMMON MISTAKES --------------- -------------------------------------------------------------------------------- Q1, Q2, Q3 corrected by AH -------------------------------------------------------------------------------- Q1 -- Algorithm Not Possible with O(m+n) time [-10] Comment: Totally wrong! Done using Linear-time sorting algorithm [-5] Comment: Not correct since you do not know the range of key values stored in BST. Some segmentation fault due to bad pointer manipulation [-2] Final tree not constructed [-3] Algorithm correct, but is not O(m+n) time [-5] No merge after array creation from T1 and T2 keys [-4] Creation of sorted array of keys from T1 and T2 using [-2] other than inorder traversal Comment: Wrong as it does not sort data Bad Space utilization because of use of auxiliary sorted [-0] arrays from T1 and T2 and merging of them into another array Comment: No need of additional arrays. NO marks deducted though! Q2(a) ----- Done using Linear-time sorting algorithm [-2] Comment: Not correct since you do not know the range of integer values stored in array. The sorting algo name is not mentioned or wrong sorting [-1] algorithm (of higher than O(nlogn)) presented O(n²) algorithm given [-4] Used counter counts for one occurrence less than actual [-1] Q2(b) ----- Done using Linear-time sorting algorithm [-4] Comment: Not correct since you do not know the range of integer values stored in array. More than O(n) expected time algorithm [-6] Improper hashing process and errors! [-3] Q3(a) ----- Only True without explanation [-5] Formal proof missing or incomplete [-2] Sketchy proof procedure [-3] Q3(b) ----- Only False without counterexample [-5] Counterexample incomplete [-2] Example graph correct but counterexample incorrect [-3] -------------------------------------------------------------------------------- Q4, Q5, Q6 corrected by AD -------------------------------------------------------------------------------- Many penalties to errors have ranges (the exact marks deducted depend on the severity and frequency of the errors). -------------------------------------------------------------------------------- Q4 -- Attempt/Mention of finding all cycles in G [-8] Comment: The graph may contain an exponential number of cycles. For example, in a complete graph, any choice of >= 3 vertices gives a cycle. So the number of cycles is C(n,3)/3 + C(n,4)/4 + ... + C(n,n)/n which is approximately no smaller than 2^n/n. No polynomial-time algorithm (let alone BFS or DFS) can find all cycles in a graph. BFS or DFS can detect the presence or absence of cycles in a graph. That's all. Errors include finding common edges (of highest costs) in cycles, removing highest-cost edges from cycles, and many other variants. Also, note that if you remove edges from a BFS/DFS tree (to break cycles), the tree changes, and in future, you may detect spurious cycles and/or miss actual cycles. Incorrect running time (in each part) [-1] Comment: Use of disjoint-set data structure in Part (a) does not lead to O(|E|) running time. In this part, you do not need to sort, but findset cannot be done in constant time. Even if you use path compression, the running time is O(|E|α(|V|)) which is technically not the same as O(|E|). Disjoint-set data structure in Part (a) [-1] Comment: Inefficient. A BFS/DFS runs faster. BFS/DFS in Part (b) [-4] Comment: Will not work. Again there may be an exponential number of BFS/DFS trees (depending on from which vertex you start the exploration, and on choosing the order of neighbors of each vertex). Moreover, a maximum spanning tree (see solution) need not be a BFS or DFS tree. For example, consider the complete graph on n vertices. Then, each BFS tree has a particular form (the root having n-1 children), whereas each DFS tree is also of another particular form (a path of length n-1). We can however adjust the edge weights so that the maximum spanning tree is of neither of these two forms. Add edge to F for every visited neighbor [0] Comment: This is technically incorrect, because the visited neighbor may be the parent of the current node in the DFS/BFS tree. Anyway, this minor oversight is not penalized. Q5 -- The marking scheme adopted was as follows: init: Steps 1-4 of solution 5 marks update: one iteration of FW (Step 5 of solution) 5 marks Problems in init -[2-3] Comments: This includes wrong formulas, using uninitialized D'[w][v] or D'[v][w] entries, and so on. Using Dijkstra in init -[2-3] Comments: The running time is O(|E| log |V|), but you are allowed O(|V|²) running time only. Also note that you need two invocations of Dijkstra, one for single source and the other for single destination. Reversing the order of init and update [-4] Setting D'[u][v] = D'[v][u] [-4] Comment: This exercise deals with directed graphs. Insufficient/Missing updating formulas -[6-8] Comment: Just mentioning in plain English which values in D' need to be updated will not do. You must explicitly write all the updating formulas. Q6 -- Use of an O(n²)-time algorithm [like naive] -[7-8] Use of Rabin--Karp or a hash function -[7-8] Comment: Rabin--Karp uses a hash function that uses a modulus. This introduces hash-function collisions. In presence of collisions, you need to make a symbol-by-symbol match before concluding equality of two strings. Many changed the Rabin--Karp algorithm so that no modular reduction is made. Now, there is no collision, and equality checking of the hash values will do. Suppose that the strings are C/C++ strings, so we are talking about the ASCII alphabet whose size is 2⁸ = 256. If you encode T (an m-bit string) as an integer, you need 8m bits. Even for a small m like 1000, this needs 8000 bits. You cannot store this integer in a built-in int data type. More importantly, the assumption that two such big integers can be compared or hashes can be updated in the Rabin--Karp way in constant time is quite invalid. Please note that Rabin--Karp uses the modular reduction just to avoid this problem. Skipping by m = |T| positions in S after a match -[6-8] Comment: This was done in naive, Rabin--Karp, and updated KMP by many. But this idea will not work. Take T = aba and S = ababaaba. After the first match, you skip three positions, and fail to get another immediate match. Later, you find another match (at the end), but r is still reported as one. But if you set j = F[j-1] as in KMP, you would have detected the occurrence abaaba in S. But by doing so, you lose the information about the previous match. Moreover, if you want to find this copy of T² in S, you should ignore j = F[j-1] after the first match, and start the search all over again. So which one will you do (j = F[j-1] or j = 0), and when? In short, no straightforward updating of KMP will work. Use of an O(n²/m)-time algorithm -[7-8] Comment: This usually includes running KMP on S,T^r for r = 1,2,3, ..., n/m. Minor logical bugs in handling two m-separated matches after KMP is called (without modification) -[1-2] GENERAL ------- Do not express your algorithms in plain verbose English. That is often too imprecise and even illegible. Writing full-fledged C/C++ code is good for compilers, but may be not so for human readers. Write appropriate pseudocodes (formal algorithms).