CS21003 Algorithms I CS29003 Algorithms Laboratory Autumn 2012, L-T-P: 3-1-0 L-T-P: 0-0-3

## Programming Assignment 8

You are given n points P1,P2,...,Pn in the unit square. Given a query point Q in the unit square, your task is to locate the point Pj closest to Q.

Under the assumption that the input points Pi are randomly distributed in the unit square, we can design a randomized algorithm that runs in constant expected time. The algorithm uses a two-dimensional hash table H. Let m = ceiling(sqrt(n)). H is an m x m array of pointers to linked lists. The unit square can be imagined to be broken up into an m x m grid of subsquares. The (s,t)-th subsquare corresponds to the hash table entry H(s,t). For each i, we locate the cell (s,t) in which the input point Pi lies, and insert the point Pi in the linked list headed by H(s,t). Under the randomness assumption about the input points, each cell is expected to head a linked list containing only a constant number of the input points.

Given a query point Q, we locate the cell (s,t) to which Q belongs. If the cell of Q contains one or more input points, we compute the minimum distance of Q from these points. If the cell of Q is empty, we find a nearby non-empty cell. In fact, for r = 0,1,2,..., we can search for a closest point in the subsquares (i,j) with |s - i| ≤ r and |t - j| ≤ r. We do not expect many non-empty cells in the grid, so such a non-empty neighboring cell for Q would be quickly located.

Unfortunately, this initial closest point found as above need not be closest to Q. In order that we locate the point closest to Q, the search should proceed along an expanding circle centered at Q. On the contrary, our search front grew in the shape of a square. However, by extending the search front to a slightly bigger square allows us to conclude that we have not accidentally missed a point closer to Q than the point identified in the initial search phase.

Write a program to do the following.

• Read n from the user. Generate n points in the unit square with random x and y coordinates. Insert the points in the two-dimensional hash table H.
• Report the statistics of hash table lists. For example, you may report the minimum and maximum size of a hashed list, and the number of lists containing eactly c entries for all c between these minimum and maximum values (see the sample output below).
• Randomly generate a query point Q in the unit square. Using the first phase of the search, find and report the potentially closest point to Q.
• Refine the search to encompass a larger square, and report the actual point closest to Q.

### Sample output

```   n = 10000
Minimum number of points in a cell = 0
Maximum number of points in a cell = 6
3684 cells with 0 points
3670 cells with 1 points
1825 cells with 2 points
642 cells with 3 points
146 cells with 4 points
28 cells with 5 points
5 cells with 6 points

+++ Search for point closest to (0.71762,0.76494)
Initial search gives    (0.71056,0.76644)
The correct result is   (0.72420,0.76436)

+++ Search for point closest to (0.38593,0.28638)
Initial search gives    (0.38338,0.28707)
The correct result is   (0.38338,0.28707)
```