A key aspect of computer science is concerned with trying to solve real world problems by a process of abstraction and representation. Typically, relevant features of the real world problem are first abstracted and then represented in natural language and in a sequence of increasingly formal forms of representation in order to implement the problem solution on a computer.
When a software system is developed, this process generates certain forms of written documentation, many in natural language (as opposed to a formal language). Computer science undergraduate courses usually seek to teach the process of software development and, to a greater or lesser extent, the communication skills needed. So computer scientists need to define and clarify the (real world) problem to be solved, and develop, document, test and evaluate the (software) solution. At each stage, they may need to explain the process and the product both to fellow professionals and to software users.
Computer science students need more than the skills involved in writing technical reports and dissertations. They also need to be able to identify certain features of the real world and describe them in natural language, as a first stage towards representing them in more precise formal diagrams and formal language descriptions, until the representation is sufficiently precise, unambiguous and consistent to be successfully encoded on a computer.
A computer science degree course is generally thought of as vocational and at the core is usually the process of software development. As a result, the driving force behind much of the writing that Computer Science students do is the process of abstraction and representation. We look at what this involves, what writing tasks the software development process generates for computer professionals and what writing students may need to do in preparation for this.
Much of what we will put forward in this paper is based on our experience of teaching undergraduate and post-graduate computer scientists including a large number of international students, from Europe and particularly the Far East. These students may enter at various points of a degree programme, depending on previous qualifications. The largest number enter the final year of the undergraduate degree, where they characteristically work very hard and are largely successful in their studies. The demands made of them in this final year include the need to work independently in a flexible manner and to write a substantial project report on the design and development of a piece of software.
The rest of this paper is in two main sections: one dealing with writing tasks during the software development process, and the other with the consequent need for certain types of writing task on computer science courses.
Computer Science often deals with problems by the process of abstracting relevant features of a real world situation in order to construct a model to solve the problem.
The model is represented in increasingly more precise and structured form in a series of transformations to remove ambiguity and inconsistency until it is sufficiently precise for the computer to be able to process it. This process may start off with the model first described in natural language, then maybe in structured diagrams and a formal language and finally in a programming language which can be implemented on a computer. This process is often made easier for us and more accurate by proceeding in structured stages from the (natural) language humans can easily process to the (programming) language machines can process.
For instance, students in their first year of programming might be asked to solve a real world problem to count the number of words in any typed sentence. A student would need to extract and define the relevant features, which may be very different from the way the features might be defined if a human and not a computer were going to solve the problem. The abstracted features might be written as follows:
A sentence is defined as a collection of words and spaces ending with a full stop.
A word is defined as a collection of characters between two spaces.
An initial design would be evolved using these features and might first be formulated in natural language:
Wordnumber calculates the number of words in a sentence. It reads each character in a sentence one by one. Each time the next character is a space, the word number tally is increased by one. When a full-stop is reached, the tally is again increased by one (for the last word) and the program stops.
For example, the program is designed to start at the first letter, or character, of a sentence and move character by character until it reaches a space, when it will record that the number of words up to this point is 1. It will then proceed, counting each space, until it reaches a full stop.
This design is not in a sufficiently structured form for a computer yet. A similar problem was set on an introductory OU course (Open University, 1988) and next expressed in a more formal way, as a form of pseudocode, and then followed by coding in the programming language PASCAL. Here is the process of transformation using just part of the model. Notice that the formal language has to spell out each step whereas the natural language version can leave us to assume that if the next character is not a space, then we just move on. This then has to be translated into a programming code which can be read by the machine, which cannot do anything unless it is precisely and exactly told what to do.
Natural language form (for space reasons, only part of the program is considered here):
Each time the next character is a space, the word number tally is increased by one.
More formal language:
if character = space
then
  update wordnumber by 1
  read in character
else
  read in character
endif
Programming language:
if character = ' ' then
begin
  wordnumber := wordnumber + 1;
  read(character)
end
else
begin
  read(character)
end;
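To show the fragment in context, the following is a minimal sketch of the whole design. It is written in Python rather than the PASCAL used on the course, and the function name `count_words` and the use of a string in place of typed input are assumptions made for illustration. It follows the natural language design literally: every space adds one to the tally, and the full stop adds one for the last word.

```python
def count_words(text: str) -> int:
    """Count words as the design describes: one per space, plus one at the full stop."""
    wordnumber = 0
    for character in text:          # read the sentence one character at a time
        if character == " ":
            wordnumber += 1         # a space marks the end of a word
        elif character == ".":
            wordnumber += 1         # the full stop marks the end of the last word
            return wordnumber       # the program stops at the first full stop
    # No full stop was found: a program reading typed input would keep waiting.
    raise ValueError("sentence has no full stop")


print(count_words("The cat sat on the mat."))  # prints 6
```

Raising an error when no full stop is found is a choice made for this sketch only; it stands in for a program that would otherwise wait for further input.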
The process is not yet finished. The model must be tested to see if it really does solve the problem. Here are some results from typing in various 'sentences' for the program to count the number of words.
| Sentence | Number of Words |
|---|---|
| The cat sat on the mat. | 6 |
| Jean had a birthday. | 4 |
| O gato sentou no tapête. | 5 |
So far the program seems to work well. It can even cope with Portuguese as well as English. Beginning students usually stop there, but if they were to continue they would discover that not all is well.
| Sentence | Number of Words |
|---|---|
| qfklp. | 1 |
| Thecatsatonthe mat. | 2 |
| Did the cat sit on the mat? Yes. | 8 |
| The  cat  sat  on  the  mat. (two spaces between each pair of words) | 11 |
We have told the program to accept a sentence as a collection of characters ending in a full stop, and a word as a collection of characters between spaces. As a result, the program cannot tell the difference between nonsense words and well-formed English words. It counts words that are run together (Thecatsatonthe) as one word because it is only counting spaces, and it cannot deal with sentences ending in anything other than a full stop. When it encounters a question mark at the end of a sentence, the program simply waits until more words are input and it finally finds a full stop. Finally, if there are double spaces between words, as in the last test, the program counts this as two words present, one for each space (which, with the count of one for the last word, brings it to a total of eleven words for the last test).
When the same test sentences were run through the word count for MS Word (version 7.a), the word count could cope with question marks and double spaces, unlike the program presented here, but it still could not cope with nonsense words or words run together. So students can see that a solution to a problem on the computer may not be designed and implemented in the same way as a human being would solve a problem, as now the problem-solver has to take the feasibility of implementing the solution on a computer into account. It may also be that a less-than-perfect solution may be acceptable in certain circumstances when traded off against the difficulty or even impossibility of successfully implementing a much more complex solution.
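One way a counter can cope with question marks and double spaces is to treat any run of spaces as a single separator and to accept several end-of-sentence marks. The sketch below, in Python, illustrates that idea only. It is not a description of how MS Word counts words, and the names `count_words_tolerant` and `SENTENCE_ENDINGS` are hypothetical. Like the program presented here, it still accepts nonsense words and cannot split words that are run together.

```python
SENTENCE_ENDINGS = ".?!"   # assumed set of end-of-sentence marks


def count_words_tolerant(text: str) -> int:
    """Count maximal runs of characters that are neither spaces nor sentence endings."""
    wordnumber = 0
    in_word = False
    for character in text:
        if character == " " or character in SENTENCE_ENDINGS:
            in_word = False             # separators never start a word
        elif not in_word:
            in_word = True              # first character of a new word
            wordnumber += 1
    return wordnumber


for sentence in ["The  cat  sat  on  the  mat.",
                 "Did the cat sit on the mat? Yes.",
                 "Thecatsatonthe mat."]:
    print(sentence, count_words_tolerant(sentence))
```

The trade-off mentioned above is visible here: a few extra lines handle spacing and punctuation, whereas recognising well-formed English words would require something far more complex, such as a dictionary.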
The process of abstraction and representation is involved in writing simple programs, developing databases and developing large software packages. When students learn how to develop a relational database, they typically start with a case study in natural language:
A college runs various courses. Every course may have a number of students enrolled on it, or might not have any at all. Students must be enrolled on only one course. One or several tutors teach on each course. Each teacher is required to teach on only one course.
This model would now be transformed into a diagram as it begins the journey towards being precise enough to be implemented on a computer. As a first step, it would be necessary to look at the natural language form and think about precisely what it means. In particular, we might look at the nouns and their number (considering for example what is meant by 'every', 'a number of', 'only one', 'one or several'). These nouns (and information about them) may well form the basis for the eventual tables in the database. There may for example be a table called 'student' with columns storing data such as student numbers, names and addresses.
Then (simplifying somewhat) the verbs must be considered in order to represent the relationships between the nouns (or entities or eventual tables), in particular the forms that signify whether the relationship between certain nouns is obligatory or optional (such as 'is required', 'must be enrolled', 'may have', 'might not have' and other less obvious forms). Students are enrolled on courses, for example, so there is a relationship of enrolment between students and courses, and this relationship is optional as regards a course (i.e. it may not have any students enrolled on it). We would also be interested in synonyms such as 'tutor'/'teacher' and 'tutor'/'teach' and whether they really mean the same or something different.
The model of the college, here expressed in natural language, is then transformed into a diagram (such as an Entity Relationship Model), and this diagram may be further transformed into a formal language model of the tables of the database, the relationships between them and the constraints as to what can be put into them. The formal language model of the database tables is then represented in a programming language and implemented on the computer. So the natural language model has been made consistent, precise and non-ambiguous in such a way that the computer can be told precisely what to do.
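As an illustration of the final step, the college case study might end up as table definitions along the following lines. This is a minimal sketch using Python's built-in `sqlite3` module. The table and column names are hypothetical, since the case study does not name them, and the sketch assumes that 'tutor' and 'teacher' mean the same thing. The obligatory relationships ('must be enrolled', 'is required to teach') become `NOT NULL` foreign keys, while the optional one (a course 'might not have any' students) needs no constraint at all.

```python
import sqlite3

# Hypothetical table and column names; the case study does not name them.
SCHEMA = """
CREATE TABLE course (
    course_id INTEGER PRIMARY KEY,
    title     TEXT NOT NULL
);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    -- must be enrolled on only one course: exactly one, so NOT NULL
    course_id  INTEGER NOT NULL REFERENCES course(course_id)
);
CREATE TABLE tutor (
    tutor_id  INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    -- required to teach on only one course
    course_id INTEGER NOT NULL REFERENCES course(course_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite does not enforce foreign keys by default
conn.executescript(SCHEMA)

conn.execute("INSERT INTO course VALUES (1, 'Databases')")   # a course may have no students
conn.execute("INSERT INTO tutor VALUES (1, 'A. Tutor', 1)")
conn.execute("INSERT INTO student VALUES (1, 'A. Student', 1)")

try:
    # A student with no course violates the obligatory side of the relationship.
    conn.execute("INSERT INTO student VALUES (2, 'B. Student', NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

The requirement that 'one or several tutors teach on each course' is not enforced by this sketch: a course with no tutors would be accepted. Capturing that constraint would need additional checks, which is one example of how much precision the natural language version leaves implicit.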
Computer Science degree courses are usually vocational and many graduates will be involved in software development. The software development process is therefore usually taught to undergraduates and it is a process which generates many written documents, several in natural language as opposed to formal languages or diagrammatic representations.
There are different methods of software development (cf. Britton & Doake (1996), Britton & Doake (2000)), but all methods will begin with finding out what the client wants the new software to do. Usually the client (or the software developers) will write these requirements for the software in a User Requirements Document (in natural language).
Using this document, system analysts will work with the client to clarify and make more precise the client's requirements for the system, including a description of the present computer or manual system that the new one will replace. This process may involve observation at the client's premises, interviewing of potential and existing users of the system, writing and distributing questionnaires, reading documents and analysing data. As a result of all this activity, the system analyst will probably produce a Specification of Requirements document written in natural language. This expands, clarifies and gives details of the client's requirements, removing inconsistencies and ambiguities (bearing in mind the needs of the system designers to whom this document will be sent). This document is a more precise statement of what the system is required to do and its expected detailed behaviour. This stage of system analysis initiates the process of modelling the solution to the client's software needs and moving towards representing the solution in a form which can be coded and implemented on the computer.
The system designers may issue a first design document in natural language, which will again be a step or two further in terms of specifying systematic detail, removing inconsistencies and ambiguities and moving the design towards a point when it is precise and consistent enough to be coded in a programming language. Before this stage is reached the design will, as previously, be represented in diagrams and then in a formal notation. Simon & Simon (1993) and Britton & Doake (1996, 2000) provide interesting examples of documents, following part of a software system through from the User Requirements Document from the client, to the various stages of the design documentation, in increasing order of precision and consistency.
The process of writing the programming code and implementing the design on the computer also involves the programmers in writing natural language documentation for the program code, and in testing and debugging. User and reference manuals, on-line help and on-line tutorials may also be written. Once the system is up and working, it has to be maintained, and this will involve referring to and updating previous documentation.
Documentation of programming code will involve natural language inserts to the code, with a symbol at each end to signal to the computer to skip over the text between the symbols and not try to read it (as it is meant for humans only). Program documentation could be used for describing what the program does, or for explaining what parts of the code are for, or why it was written and designed in a particular way. Without this, it can often be very difficult and time-consuming for maintenance programmers to understand the functions performed by parts of the program code, which is especially crucial if they are trying to correct bugs in the system or make additions to it.
The following is an example taken from the previous simple word counting program, documenting the purpose of a part of the code, and enclosed in curly brackets so it will not be read by the computer:
if character = ' ' then
begin
wordnumber := wordnumber + 1;
read(character)
end
{Increases the count of the number of words by one if the next
character is a space and then moves on to the next character.}
Another example consists of a tiny fragment of the code for a Mastermind computer game, with comments explaining the function of the code and giving reasons. The program chooses a random selection of 4 or 6 pegs for a game board. It draws one or more pegs as circles on the screen, filled in with a particular colour. Again the comments are in curly brackets:
{The pen will be used to draw the peg's border, so the pen colour is
set to black:}
PlayingArea.Canvas.Pen.Color := clBlack;
{The brush will be used to colour in the drawing, so the brush colour
is set to the colour of the peg:}
PlayingArea.Canvas.Brush.Color := BrushColorFromPeg(p);
Given this background to what may be required of computer science students in the commercial world in terms of writing, we now turn to what students may need to write on an undergraduate computer science course as preparation for this.
It is not easy to provide work with real clients, though it may sometimes be available for final year students. Other students usually have to write a Requirements Specification document on the basis of a provided case study instead of information from the client. This document will be in natural language structured under headings. Simulations may be used for interviewing the 'client' to find out more precise detail and so on. Here are extracts from such a possible case study:
Mrs. Green runs a "gorilla-gram" agency. A customer will ring in with a message for the gorilla to deliver at a particular date and time. These are recorded on pages in a ring-binder ... At the end of the week she calculates how much she has to pay each gorilla (they are paid £10 per call) ...
Students may also need to write a design document in natural language and also write their reasons for design decisions. User guides and/or on-line help systems may also be written, giving instructions and explanations to users. Students will also need to provide program documentation, giving program comments (see previous examples) and reports of testing methods and results.
Reflective essays are often asked for and may be short or long. Students may be asked to reflect on practical work based on a case study, for example, by giving reasons for their design decisions and evaluating the result in terms of its real world effectiveness and usefulness. Other essays may involve discussion of different views or methods in a practical context, e.g. discussing how different views of software quality might apply to a particular situation. Many students tend to find precise and consistent evaluation and discussion quite challenging, especially as they need to provide evidence to back up their views.
Students may also be required to write reports (and give oral presentations). For instance, a second year business IT course may give out a case study that gives details of a small company wanting to expand its business of producing educational software. Students might be in groups of six, each student representing a department in the company and writing a report for that department. On the basis of these reports, the group will then develop a systematic company expansion strategy and give a team oral presentation making their case to the Board (with supporting documentation).
Examination questions are often quite short (though some can be longer and in essay form). While many short questions may require students to produce code or to work out a formal solution to a problem, others will require short answers in natural language.
A common type of question involves translation from one form of representation to another:
e.g. explain in simple English what the following predicate means:
dom passwordOf = allUsers
Answer: Anyone who has a password must be a user and all users must have a password.
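The same translation can be carried one step further, from the formal predicate into code. The sketch below is in Python and is only an illustration: it models the function `passwordOf` as a dictionary and the set `allUsers` as a set, and the user names, passwords and the helper `domain` are hypothetical.

```python
# Hypothetical data: password_of models the function passwordOf as a dict,
# and all_users models the set allUsers.
all_users = {"ann", "bob", "cho"}
password_of = {"ann": "x7!kq", "bob": "pa55", "cho": "s3cret"}


def domain(function: dict) -> set:
    """Return the domain of a function modelled as a dict: the set of its keys."""
    return set(function)


# dom passwordOf = allUsers
print(domain(password_of) == all_users)          # prints True

# Removing a password leaves a user without one, so the predicate fails.
del password_of["cho"]
print(domain(password_of) == all_users)          # prints False
```

Both halves of the English answer are checked by the single equality: a key outside `all_users` would be a password holder who is not a user, and a missing key is a user with no password.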
Other short questions, expecting short answers, involve statements followed by a question, such as:
e.g. Some systems map directly from a text name to a hardware address. What would be the limitations of this in an internet?
Other questions might ask students to compare and contrast, explain, illustrate, demonstrate, make a case for, suggest why or identify and describe e.g.
Explain what is meant by an interface definition language.
There may be some questions asking students to produce longer reflective essays, e.g.
a response to a case study on computer ethics (which may be given out in advance)
or
Give guidelines for designing a series of screens in order to facilitate user navigation through the screens. Justify your choice of guidelines.
Final year undergraduates will generally need to research and write a project report, though institutions will vary as to the nature of the project. Most probably the student will be required to solve a practical problem from a case study or client and evaluate the solution and the methods used. A practical project will involve designing, developing, implementing and testing a piece of software and then writing a reasonably substantial report. The student may be required to aim this report at a reader who has, for example, completed the same courses as the writer, but is unfamiliar with the project. The report is generally of the classic problem-solution form and is often, though not necessarily, structured as follows:
Abstract
Table of Contents
Introduction
Main Chapters (including review of literature)
Discussion and Evaluation
Bibliography (often using the Harvard System for citing paper sources, and some particular method for citing WWW sources)
Appendices (design documents; program code; testing method, analysis and results)
Markers may look for evidence that the student
Structure, style and use of language are important, especially in their role of contributing to the provision of clearly comprehensible evidence for the points listed above.
The following is an example of a possible problem a student might undertake. A student doing this project would need to find a software solution and then write a project report (including the code for the software, and all the diagrams and formal language representations generated on the way to the code, in a series of appendices):
Title: Timetabling Lectures, Lecturers and Rooms
Topic Areas: Data Analysis, Database Design, User Interface Design, Programming
Aim: To design and implement a database system to assist in the timetabling of lectures in an academic institution.
Objectives:
Notes: Student may wish to use the Computer Science Departmental Timetabler as a customer.
Computer science writing tends to take place in a professional situation where there is a need to define and clarify a problem and then design, implement, document, and test a solution. At various stages, the computer scientists involved will need to communicate with both users and other computer scientists. Undergraduate courses will usually aim to prepare students in some way for these professional requirements. Students will need not only to produce code, but also to use language precisely and consistently to analyse the problem and to initiate and evaluate a solution. They will also need to learn to explain technical information to other computing professionals as well as to clients or software users, who are generally not computer scientists. So students need to learn how to use language to produce professional documentation, and also to write for academic purposes such as producing essays and project reports.
Britton, C. & Doake, J. (1996) Software System Development: a gentle introduction. 2nd ed. London. McGraw-Hill.
Britton, C. & Doake, J. (2000) Object-Oriented Software System Development: a gentle introduction. London. McGraw-Hill.
Open University (1988) Fundamentals of Computing. Milton Keynes. Open University Press.
Simon, A.R. & Simon, J. (1993) The Computer Professionals Guide to Effective Communications. New York. McGraw-Hill.