COMMON MISTAKES
---------------
--------------------------------------------------------------------------------
Q1, Q2, Q3 corrected by AH
--------------------------------------------------------------------------------
Q1
--
Algorithm Not Possible with O(m+n) time [-10]
Comment: Totally wrong!
Done using Linear-time sorting algorithm [-5]
Comment: Not correct since you do not know the range of
key values stored in BST.
Some segmentation fault due to bad pointer manipulation [-2]
Final tree not constructed [-3]
Algorithm correct, but is not O(m+n) time [-5]
No merge after array creation from T1 and T2 keys [-4]
Creation of sorted array of keys from T1 and T2 using [-2]
other than inorder traversal
Comment: Wrong as it does not sort data
Bad Space utilization because of use of auxiliary sorted [-0]
arrays from T1 and T2 and merging of them into another
array
Comment: No need of additional arrays. NO marks deducted though!
Q2(a)
-----
Done using Linear-time sorting algorithm [-2]
Comment: Not correct since you do not know the range of
integer values stored in array.
The sorting algo name is not mentioned or wrong sorting [-1]
algorithm (of higher than O(nlogn)) presented
O(n²) algorithm given [-4]
Used counter counts for one occurrence less than actual [-1]
Q2(b)
-----
Done using Linear-time sorting algorithm [-4]
Comment: Not correct since you do not know the range of
integer values stored in array.
More than O(n) expected time algorithm [-6]
Improper hashing process and errors! [-3]
Q3(a)
-----
Only True without explanation [-5]
Formal proof missing or incomplete [-2]
Sketchy proof procedure [-3]
Q3(b)
-----
Only False without counterexample [-5]
Counterexample incomplete [-2]
Example graph correct but counterexample incorrect [-3]
--------------------------------------------------------------------------------
Q4, Q5, Q6 corrected by AD
--------------------------------------------------------------------------------
Many penalties to errors have ranges (the exact marks deducted depend on the
severity and frequency of the errors).
--------------------------------------------------------------------------------
Q4
--
Attempt/Mention of finding all cycles in G [-8]
Comment: The graph may contain an exponential number of
cycles. For example, in a complete graph, any
choice of >= 3 vertices gives a cycle. So the
number of cycles is C(n,3)/3 + C(n,4)/4 + ... +
C(n,n)/n which is approximately no smaller than
2^n/n. No polynomial-time algorithm (let alone
BFS or DFS) can find all cycles in a graph.
BFS or DFS can detect the presence or absence
of cycles in a graph. That's all.
Errors include finding common edges (of highest
costs) in cycles, removing highest-cost edges from
cycles, and many other variants. Also, note that
if you remove edges from a BFS/DFS tree (to break
cycles), the tree changes, and in future, you may
detect spurious cycles and/or miss actual cycles.
Incorrect running time (in each part) [-1]
Comment: Use of disjoint-set data structure in Part (a)
does not lead to O(|E|) running time. In this
part, you do not need to sort, but findset cannot
be done in constant time. Even if you use path
compression, the running time is O(|E|α(|V|))
which is technically not the same as O(|E|).
Disjoint-set data structure in Part (a) [-1]
Comment: Inefficient. A BFS/DFS runs faster.
BFS/DFS in Part (b) [-4]
Comment: Will not work. Again there may be an exponential
number of BFS/DFS trees (depending on from which
vertex you start the exploration, and on choosing
the order of neighbors of each vertex). Moreover,
a maximum spanning tree (see solution) need not be
a BFS or DFS tree. For example, consider the complete
graph on n vertices. Then, each BFS tree has a
particular form (the root having n-1 children),
whereas each DFS tree is also of another particular
form (a path of length n-1). We can however adjust
the edge weights so that the maximum spanning tree
is of neither of these two forms.
Add edge to F for every visited neighbor [0]
Comment: This is technically incorrect, because the visited
neighbor may be the parent of the current node in
the DFS/BFS tree. Anyway, this minor oversight is
not penalized.
Q5
--
The marking scheme adopted was as follows:
init: Steps 1-4 of solution 5 marks
update: one iteration of FW (Step 5 of solution) 5 marks
Problems in init -[2-3]
Comments: This includes wrong formulas, using uninitialized
D'[w][v] or D'[v][w] entries, and so on.
Using Dijkstra in init -[2-3]
Comments: The running time is O(|E| log |V|), but you
are allowed O(|V|²) running time only. Also
note that you need two invocations of Dijkstra,
one for single source and the other for single
destination.
Reversing the order of init and update [-4]
Setting D'[u][v] = D'[v][u] [-4]
Comment: This exercise deals with directed graphs.
Insufficient/Missing updating formulas -[6-8]
Comment: Just mentioning in plain English which values in D'
need to be updated will not do. You must explicitly
write all the updating formulas.
Q6
--
Use of an O(n²)-time algorithm [like naive] -[7-8]
Use of Rabin--Karp or a hash function -[7-8]
Comment: Rabin--Karp uses a hash function that uses a
modulus. This introduces hash-function collisions.
In presence of collisions, you need to make a
symbol-by-symbol match before concluding equality
of two strings.
Many changed the Rabin--Karp algorithm so that no
modular reduction is made. Now, there is no collision,
and equality checking of the hash values will do.
Suppose that the strings are C/C++ strings, so we
are talking about the ASCII alphabet whose size is
2⁸ = 256. If you encode T (an m-bit string) as an
integer, you need 8m bits. Even for a small m like
1000, this needs 8000 bits. You cannot store this
integer in a built-in int data type. More importantly,
the assumption that two such big integers can be
compared or hashes can be updated in the Rabin--Karp
way in constant time is quite invalid.
Please note that Rabin--Karp uses the modular
reduction just to avoid this problem.
Skipping by m = |T| positions in S after a match -[6-8]
Comment: This was done in naive, Rabin--Karp, and updated
KMP by many. But this idea will not work. Take
T = aba and S = ababaaba. After the first match,
you skip three positions, and fail to get another
immediate match. Later, you find another match (at
the end), but r is still reported as one. But
if you set j = F[j-1] as in KMP, you would have
detected the occurrence abaaba in S. But by doing
so, you lose the information about the previous
match. Moreover, if you want to find this copy of T²
in S, you should ignore j = F[j-1] after the first
match, and start the search all over again. So
which one will you do (j = F[j-1] or j = 0), and
when? In short, no straightforward updating of KMP
will work.
Use of an O(n²/m)-time algorithm -[7-8]
Comment: This usually includes running KMP on S,T^r for
r = 1,2,3, ..., n/m.
Minor logical bugs in handling two m-separated matches
after KMP is called (without modification) -[1-2]
GENERAL
-------
Do not express your algorithms in plain verbose English.
That is often too imprecise and even illegible. Writing
full-fledged C/C++ code is good for compilers, but may
be not so for human readers. Write appropriate pseudocodes
(formal algorithms).