Computer Science


Assignments

Assignment #1 Due: Tuesday 29 Apr 08, in class

You should do this assignment alone. Feel free to use any sources of information you can find--the goal is to learn as much as you can about parallel computing.

You may work with others, but the algorithms and programming should be your work.

The purpose of this assignment is to give you an opportunity to experience writing simple programs using shared memory.

This experience will provide a foundation for writing and/or running much more complex programs.

You may use any machine available to you that has at least four distinct processors. One such machine you may use is wintermute03.cs.auckland.ac.nz, a computer that has four cores, each with two strands (hardware threads).

You must use this machine only for 703 homework assignments or, with instructor's permission, for your course project. You have many choices in doing this assignment. You may use any language and any tools available on this computer.


Assignment 1 (5% of grade)

Write a shared-memory program that reads from a file a list of double-precision floating point numbers and rearranges them. It then writes to the standard output(the screen, most probably) the largest, smallest, and median values. Your program should operate in parallel with P processors specified on the command line. You may want to test your program with smaller files, but the benchmark data consists of 50 million double-precision (8 byte) floating point numbers stored in binary format in a file to be made available. The data is little-endian (since the processors are Intel Xeons) so be careful to convert if moving to a big-endian machine. As a sanity check the first element in the array is: 0.0108401219279

You can find the file at: /var/tmp/floats

Do not try to copy the file--it is 400 MBytes! In addition to printing the three values, you should also print out (a) the wall-clock time from when the file is opened until it is assembled in main memory as a single array, and (b) the wall-clock time from when the array is filled until the three values have been determined (but not yet printed). You may want to run the experiments several times, because times may vary based on other demand on the system. You should report only the best time.

The assignment is to determine from the data the three values of minimum, maximum, and median. There is a linear-time algorithm for determining median, and you are permitted (but not encouraged) to use this algorithm instead of sorting. You need not do a complete sort if you can determine these three numbers from the data by some other means.

As another sanity check, you could try your program on the file floats.small. This one has the same distribution as the other file but is only 4MBs. The stats for this file are as follows:-

first 0.0201262799209
min 5.13395136603e-22
max 6.77587867453e+21
median 1.0026277229
length 500000

Important: this isn't the assignment file. Use it as a sanity check only if you want to.

What to Hand In

Turn this homework in on paper in class at the beginning of lecture on the due date (Tuesday 6May08) - This is the new date and it includes the extension. No late assignments will be accepted.

Describe the algorithm you used, how you parallelized the program and why you believe this is the best method. Provide printouts of your program using 1, 2, & 4 processors. These runs should use processor affinity to make sure that the strands are running on different processors. In addition, you should run the program with 8 strands, the full complement available on this processor.

Printout should include times, number of processors, and the answers. (M is the maximum number of processors/hardware threads you are able to use, ideally 16). Report some other result you have obtained developing and testing the program, e.g., the time for sorting a range of file sizes, or the results of sorting a partially sorted file. Explain why the result is interesting.

As an appendix, you should attach source code for all experiments you reported.

Hints

1. Use the following for CPU/Memory wintermute03 stats:

cat /proc/cpuinfo # CPUs/cores and their numbers

free -m # Total and free memory (in mbs)

top # Whether others might be using it

2. wintermute03 has four processors, each with two strands (hardware threads). Two strands running on a single processor never run twice as fast as a single strand. You should explore the difference, for example, between two strands running on the same processor and two strands running on different processors.

3. Things to google:
- General: cpu affinity, selection algorithm, quicksort
- Python: Numeric, python affinity 0.1.0
- Linux: linux, Compsci 215 Lectures

If you have any questions, please feel free to contact Fuad (fuad at cs.auckland.....)


Apply now!


Handbook

Postgraduate study options

Computer Science Blog



Please give us your feedback or ask us a question

This message is...


My feedback or question is...


My email address is...

(Only if you need a reply)

A to Z Directory | Site map | Accessibility | Copyright | Privacy | Disclaimer | Feedback on this page