Student Projects
S. Manoharan
Department of Computer Science
The following are some of the student projects on offer.
However, if you are keen on a particular project in the
research area
I am interested in, then feel free to discuss it
with me.
- High-performance Computer Architectures
- Hardware-based Cache Prefetching Models.
- Program Analysis to Predict Memory Access Patterns.
- A Pentium simulator.
- Parallel and Distributed Systems
- Parallel Solutions to Optimization Problems.
- Assignment of Program Structures onto Parallel Processors.
- Task Duplication.
- Media Processing
- Steganography for media files.
- Robust steganography and water-marking of media files.
- Internet Programming
- An Online Defect Tracker. A database to track defects.
- Web-based report management.
- Hardware-based Cache Prefetching Models.
-
Background.
In high-performance microprocessor systems, the penalties for
cache misses can be high. A typical miss can cause a processor to
run anywhere from 20 to hundreds of times slower. Prefetching
into the cache may reduce the number of cache misses for the
typical cache-unfriendly applications.
Software-based prefetching suffers from the fact that it ties
valuable processor time predicting what data need to prefetched and
fetching them.
The advantage of a hardware-based prefetching is that it may
operate concurrently with the processor.
Aim.
The aim of this project is to model a hardware-based prefetch system
where in addition to the processor (the execution engine), a prefetch
engine interacts with the cache hierarchy.
Both the execution and prefetch engines run the same program
thread but with different objectives.
The execution engine
follows the usual fetch-operands and execute cycle. The prefetch
engine does not execute; if simply prefetches the operands into the
cache.
The prefetch engine therefore runs ahead of the execution engine,
prefetching the data that the execution engine may require.
There are details to work out. For example, how would the prefetch
engine know which control path the execution engine is to take?
And what happens if the prefetch engine runs too far ahead
of the execution engine?
The project will develop a simulation model of a system consisting
of an execution engine, a prefetch engine, and a cache hierarchy.
The model will then be used to study the system and to work out the
parameters governing the details of the system.
- Program Analysis to Predict Memory Access Patterns.
-
Background.
In high-performance microprocessor systems, the penalties for
cache misses can be high. A typical miss can cause a processor to
run anywhere from 20 to hundreds of times slower.
Programs that do not exhibit enough spatial and temporal localities
suffer badly.
If the memory access patterns of such programs can be predicted,
then prefetching those memory locations into the cache will reduce
the number of cache misses.
Aim.
This project requires analysing the memory access
patterns of a number of large benchmarks (e.g. Livermore
loop kernels, NAS, Linpack, etc.).
It will identify the possible prefetch
blocks (blocks that are likely to be used in the near future)
in these benchmarks and instrument them to mark such
blocks. If the prefetch blocks are identified correctly,
the data can be prefetched into the cache, thereby providing a
two to threefold decrease in the execution time of
typical cache-unfriendly application programs.
- Assignment of Program Structures onto Parallel Processors.
-
Background.
A program with multiple tasks can be
viewed as a dependency graph:
the vertices represent the tasks and the edges represent the
dependencies between the tasks.
An assignment divides the task set T into m,
some possibly empty, ordered subsets or partitions. Here m
is the number of processors available.
The total time the set of tasks T takes to execute on the set of
processors P is called the makespan.
The objective of the assignment is to minimize the makespan.
A naive approach to solve the assignment problem is to
enumerate all the possible assignments and choose the assignment
that gives the minimum makespan. However,
this approach will take exponential time.
It is very unlikely that there
would be any cleverer scheme to find the optimal assignment in
polynomial time, since even the restricted cases of the assignment
problem have been proved to be NP-complete.
Practical assignment schemes thus settle for some heuristics
that would find sub-optimal assignments in polynomial
time.
Aim.
The aim of this project is to use some of the combinatorial
optimization schemes such as Genetic Algorithms, Simulated Annealing,
and Tabu Search to find sub-optimal solutions to the
assignment problem. The project will use some typical dependency
graphs taken from the basic blocks of programs and some
randomly-generated dependency graphs. It will compare the
optimization schemes in the light of their applicability to the
assignment problem.
The programs implementing the optimization schemes and the objects
modelling the tasks, task graphs, and the processor graphs will be
developed in C++.
- A Pentium Simulator.
-
Background.
A CPU simulator mimics the behaviour of a CPU, under a given set
of assumptions.
Building a simulation model of a processor enables one to
understand the processor architecture with a practical dimension.
Aim.
The aim of this project is to build an instruction-level simulator
for a Penitium microprocessor. The simulator will be able to execute
binary files created by standard compilation systems on a Pentium PC
running Linux. Such a simulator is useful to study the
behaviour of the processor and the memory subsystem under real
program loadings. It will also be useful to experiment with
architectural improvements.
- Parallel Solutions to Optimization Problems.
-
Background.
Given N cities, what is the shortest
closed tour in which each city can be visited once?
This is known as the travelling salesman problem (TSP).
This is is a classical problem in the field of
combinatorial optimization, concerned with efficient methods for
maximizing or minimizing a function of many independent variables.
All exact methods known for determining an optimal route of
TSP require a computing effort that increases exponentially
with number of cities, so in practice exact solutions can be
attempted only on problems involving a few hundred cities or less.
TSP belongs to the large class of nondeterministic polynomial
time complete problems.
Practical methods to solve large-scale TSP thus settle for some
schemes such as Genetic Algorithms, Simulated Annealing,
and Tabu Search to find sub-optimal solutions within an
acceptable time.
Aim.
The aim of this project is to get parallel implementations to
some of the combinatorial
optimization schemes such as Genetic Algorithms, Simulated Annealing,
and Tabu Search.
- Task Duplication.
-
Background. Is it advantageous to duplicate a job?
To duplicate a job requires extra resources, but if there
are other critical jobs that depend on the completions of this job,
then the duplication may be advantageous.
Aim.
Scheduling jobs for efficiency is a classical problem that one
encounters in many fields.
This project aims to study the conditions under which
job duplication is advantageous, and to design and implement
algorithms that duplicate jobs.
- Steganography for Media files.
-
Background. Steganography is the art of hiding information
in ways that prevent easy detection of hidden messages.
While cryptography scrambles messages so that they cannot be
understood, steganography hides messages so that they cannot be
seen. Messages can be hidden in any media. E.g. a text message
can be hidden in a bitmap or raster image; a jpeg file can be
hidden in an mp3 sound file.
Digital watermarking, where the hidden message is a watermark
of the original author of the media file, is a field of growing
importance.
A demonstration of embedding messages in bitmaps can be found
at
http://www.cs.auckland.ac.nz/~mano/steg/.
Aim.
The aim of this project is to study various media file formats
and implement a steganography tool that allows hiding messages
in these media formats. The media formats may include mp3, mpeg,
wav, and jpeg.
- An Online Defect Tracker.
-
Background.
A product lifecycle typically consists of design,
development, testing,
and maintenance. Defects of various sorts get introduced into the
product at any of these stages. It is important to keep track of the
defects detected and their status throughout the lifecycle of the
product. This provides the product developer with a written history
of the lifecycle. Typically, this history is useful for
debugging, testing, product assessment, risk analysis, feasibility
studies, etc.
Aim.
The aim of this project is to develop a database
to track the defects, and to design a web-interface for user
interaction.
Contact me if you like to discuss a project.