CS13002 Programming and Data Structures

Spring semester

Introduction

What is a digital computer?

A computer is a machine that can perform computation. It is difficult to give a precise definition of computation. Intuitively, a computation involves the following three components:

Usually, computations are carried out to solve some meaningful and useful problems. One supplies some input instances for the problem, which are then analyzed in order to obtain the answers for the instances.

Types of problems

  1. Functional problems

    A set of arguments a1,a2,...,an constitute the input. Some function f(a1,a2,...,an) of the arguments is calculated and output to the user.

  2. Decision problems

    These form a special class of functional problems whose outputs are "yes" and "no" (or "true" and "false", or "1" and "0", etc).

  3. Search problems

    Given an input object, one tries to locate some particular configuration pertaining to the object and outputs the located configuration, or "failure" if no configuration can be located.

  4. Optimization problems

    Given an object, a configuration and a criterion for goodness, one finds and reports the configuration pertaining to the object, that is best with respect to the goodness criterion. If no such configuration is found, "failure" is to be reported.

Specific examples

  1. Polynomial root finding

    Category: Functional problem

    Input: A polynomial with real coefficients
    Output: One (or all) real roots of the input polynomial
    Processing: Usually, one involves a numerical method (like the Newton-Raphson method) for computing the real roots of a polynomial.

  2. Matrix inversion

    Category: Functional problem

    Input: A square matrix with rational entries
    Output: The inverse of the input matrix if it is invertible, or "failure"
    Processing: Gaussian elimination is a widely used method for matrix inversion. Other techniques may also be conceived of.

  3. Primality testing

    Category: Decision problem

    Input: A positive integer
    Output: The decision whether the input integer is prime or not
    Processing: For checking the primality of n, it is an obvious strategy to divide n by integers between 2 and square root of n. If a divisor of n is found, n is declared "composite" ("no"), else n is declared "prime" ("yes").

    This obvious strategy is, however, very slow. More practical primality testing algorithms are available. The first known (theoretically) fast algorithm is due to three Indians (Agarwal, Kayal and Saxena) from IIT Kanpur.

  4. Traveling salesman problem (TSP)

    Category: Optimization problem

    Input: A set of cities, the cost of traveling between each pair of cities, and the criterion of cost minimization
    Output: A route through all the cities with each city visited only once and with the total cost of travel as small as possible
    Processing: Since the total number of feasible routes for n cities is n!, a finite quantity, checking all routes to find the minimum is definitely a strategy to solve the TSP. However, n! grows very rapidly with n, and this brute-force search is impractical. We do not know efficient solutions for the TSP. One may, however, plan to remain happy with a suboptimal solution in which the total cost is not the smallest possible, but close to it.

  5. Weather prediction

    Category: Functional problem

    Input: Records of weather for previous days and years. Possibly also data from satellites.
    Output: Expected weather of Kharagpur for tomorrow
    Processing: One statistically processes and analyzes the available data and makes an educated extrapolating guess for tomorrow's weather.

  6. Web browsing

    Category: Functional problem

    Input: A URL (abbreviation for "Uniform Resource Locator" which is colloquially termed as "Internet site")
    Output: Display (audio and visual) of the file at the given URL
    Processing: Depending on the type of the file at the URL, one or more specific programs are run and the desired output is generated. For example, a web browser can render an HTML page, images in some formats etc. For displaying a movie, a separate software (or its plug-in) need be employed.

  7. Chess : Can I win?

    Category: Search problem

    Input: A configuration of the standard 8x8 chess board and the player ("white" or "black") who is going to move next
    Output: A winning move for the next player, if existent, or "failure"
    Processing: In general, finding a winning chess move from a given state is a very difficult problem. The trouble is that one may have to explore an infinite number of possibilities. Even when the total possibilities are finite in number, that number is so big that one cannot expect to complete exploration of all of these possibilities in a reasonable time. A more practical strategy is to investigate all possible board sequences involving a small number of moves starting from the given configuration and to identify the best sequence under some criterion and finally prescribe the first move in the best sequence.

A computer is a device that can solve these and similar problems. A digital computer accepts, processes and outputs data in digitized forms (as opposed to analog forms).

The basic components of a digital computer

In order that a digital computer can solve problems, it should be equipped with the following components:

The interconnection diagram for a simple computer is shown in the figure below. This architecture is commonly called the John von Neumann architecture after its discoverer who was the first to give a concrete idea of stored program computers. Surprisingly enough, the idea of computation (together with a rich theory behind it) was proposed several decades earlier than the first real computer is manufactured. John von Neumann proposed the first usable draft of a working computer.

computer organization

Figure : The John von Neumann architecture

How does a program run in a computer?

The inputs, the intermediate values and the instructions defining the processing stage reside in the (main) memory. In order to separate data from instructions the memory is divided into two parts:

The fetch-decode-execute cycle works as follows:

  1. For starting the execution of a program, a sequence of machine instructions is copied to the instruction area of the memory. Also some global variables and input parameters are copied to the data area of the memory.

  2. A particular control register, called the program counter (PC), is loaded with the address of the first instruction of the program.

  3. The CPU fetches the instruction from that location in the memory that is currently stored in the PC register.

  4. The instruction is decoded in the control unit of the CPU.

  5. The instruction may require one or more operands. An operand may be either a data or a memory address. A data may be either a constant (also called an immediate operand) or a value stored in the data area of the memory or a value stored in a register. Similarly, an address may be either immediate or a resident of the main memory or available in a register.

  6. An immediate operand is available from the instruction itself. The content of a register is also available at the time of the execution of the instruction. Finally, a variable value is fetched from the data part of the main memory.

  7. If the instruction is a data movement operation, the corresponding movement is performed. For example, a "load" instruction copies the data fetched from memory to a register, whereas a "save" instruction sends a value from a register to the data area of the memory.

  8. If the instruction is an arithmetic or logical instruction, it is executed in the ALU after all the operands are available in the CPU (in its registers). The output from the ALU is stored back in a register.

  9. If the instruction is a jump instruction, the instruction must contain a memory address to jump to. The program counter (PC) is loaded with this address. A jump may be conditional, i.e., the PC is loaded with the new address if and only if some condition(s) is/are true.

  10. If the instruction is not a jump instruction, the address stored in the PC is incremented by one.

  11. If the end of the program is not reached, the CPU goes to Step 3 and continues its fetch-decode-execute cycle.
CPU-memory interaction

Figure : Execution of a program

Why need one program?

The electronic speed possessed by computers for processing data is really fabulous. Can you imagine a human prodigy manually multiplying two thousand digit integers flawlessly in an hour? A computer can perform that multiplication so fast that you even do not perceive that it has taken any time at all. It is wise to exploit this instrument to the best of our benefit. Why not, right?

However, there are many programs already written by professionals and amateurs. Why need we bother about writing programs ourselves? If we have to find roots of a polynomial or invert/multiply matrices or check primality of natural numbers, we can use standard mathematical packages and libraries. If we want to do web browsing, it is not a practical strategy that everyone writes his/her own browser. It is reported that playing chess with the computer could be a really exciting experience, even to world champions like Kasparov. Why should we write our own chess programs then? Thanks to the free (and open-source) software movement, many useful programs are now available in the public domain (often free of cost).

Still, we have to write programs ourselves! Here are some compelling reasons:

How can one program?

Given a problem at hand, you tell the computer how to solve it and the machine does it. Unfortunately, telling the computer your processing steps is not that easy. Computers can be communicated with only in the language that they understand and are quite stubborn about that.

You have to specify the exact way in which the fetch-decode-execute cycle is to be carried out so that your problem is solved. The CPU of a computer supports a primitive set of instructions (typically, data movement, arithmetic, logical and jump instructions). Writing a program using these instructions (called assembly instructions) has two major drawbacks:

A high-level language helps you make your communication with computers more abstract and simpler and also widely machine-independent. You then require computer programs that convert your high-level description to the assembly-level descriptions of individual machines, one program for each kind of CPU. Such a translator is called a compiler.

Therefore, your problem solving with computers involves the following three steps:

  1. Write the program in a high-level language

    You should use a text editor to key in your program. In the laboratory we instruct you to use the emacs editor. You should also save your program in a (named) file.

    We are going to teach you the high-level language known as C.

  2. Compile your program

    You need a compiler to do that. In the lab you should use the C compiler cc (a.k.a. gcc).

       cc myprog.c
    
    If your program compiles successfully, a file named a.out (an abbreviation of "assembler output") is created. This file stores the machine instructions that can be understood by the particular computer where you invoked the compiler.

    If compilation fails, you should check your source code. The reason of the failure is that you made one or more mistakes in specifying your idea. Compilers are very stubborn about the syntax of your code. Even a single mistake may let the compiler churn out many angry messages.

  3. Run the machine executable file

    This is done by typing

       ./a.out
    
    (and then hitting the return/enter button) at the command prompt.

Your first C programs


Course home