The assignment this week does not count towards your credit for the unit. The main purpose is to learn to write functions which involve arrays, strings and loops. You may need information from the lecture notes up to the strings chapter.
You are asked to write your own versions of some of the string functions in the standard library. Each function should be implemented directly, without calling any library functions. (The functions can call each other.) The declarations of the functions you are asked to write are:
int length(const char s[]); void copy(char t[], const char s[]); int compare(const char s[], const char t[]); void append(char t[], const char s[]); int find(const char t[], const char s[]);
These are intended to be custom versions of strlen
,
strcpy
, strcmp
, strcat
and
strstr
.The tests will provide the detailed specifications of the functions.
You are given this skeleton program and Makefile as a starting point:
Here are some notes:
Type:
make ./strings
Array notation is used rather than pointer notation. You don't need to use
any pointer notation in your answers. (It is traditional to use pointer
notation in functions like this, but array notation is arguably superior,
because it is more readable and can actually be more efficient nowadays, because
the compiler with the -O2
option has more information about what
you mean so it can optimise better.)
The program doesn't do anything, except run tests on the functions. This is
what you would do if you were developing a library module. The only additions
for a real library module would be (a) you would add a header
strings.h
, like the standard one, for users to include and (b) you
would rename main
to avoid a clash with a program that used your
library. You will see that in detail in later assignments.
Read the tests to find the detailed specification of what each function has
to achieve. Note that, for the compare
function, you only need to
return a negative, zero or positive number, not necessarily -1, 0, +1.
Historically, that's for efficiency reasons because you would have to add an
extra unnecessary step to
normalise. A couple of the original
functions unnecessarily return the result string - the custom versions don't,
because it is not possible using only array notation. The find
function returns an int
position rather than a substring for the
same reason.
For example, the return type of length
is int
,
rather than size_t
. That means it won't work for super-long
strings. However, for a program which handles such strings, you would
probably want to design a custom string type anyway.
const
keyword is usedThe const
keyword is optional. However, it has advantages.
For one thing, it helps you to implement the functions, because the compiler
will detect any mistaken attempt to change a string which you are not supposed
to change. Also, it helps the compiler to optimise the code.
As usual, it is a case of working in small steps, one test at a time, keeping your program in a working state. Here are some things to think about:
The assignment is about writing loops. Obviously, calling
strlen
to make length
easy isn't allowed. As well as
the tests in the skeleton, some further tests will be made on your program while
marking to try to ensure that you haven't used any library functions.
The efficiency of your functions won't be tested in the marking. But efficiency is tradionally a big concern in writing reusable library functions. Don't agonize over efficiency, especially if it is at the expense of readability or correctness. I suggest you just give a little thought to how many times each function loops through the relevant characters. (If you are fascinated by efficiency, you could do timings or look at the compiled code in assembly language format, but that is definitely not required.)
Suppose you write a loop for (int i=0; i<length(s); ...)...
.
You might think that the function length
is called each time round
the loop. Since length
itself involves a loop, that would be
inefficient. But if s
is declared const
and you are
using the -O2
option to compile, the compiler will notice
that length(s)
only needs to be called once.
If you write for (int i=0; i<length(s); ...)...
then there
are two loops through the characters, one in the length function, and the one
you've just written. You may be able to get away with one loop by
writing for (int i=0; s[i] != '\0'; ...)...
.
If you write for (int i=0;...)...
there may be one more thing
you need to do, at the end of the string. But the variable i
is no
longer available, because it is local to the loop. If instead you write
int i; ... for (i=0;...)...
then i
is available after
the loop ends, and may have a useful value in it.
If you have time, and if you are interested, you can write another C program. Write any program you want, of the same sort of scope as the string program, involving arrays and loops.
Submit your source file, plus a readme.txt
to explain what you
have done, or if you want to catch my attention.
The tests in the string function count for 50% of the assignment. The
filename must be strings.c
precisely. Marks will be taken off if the
filename is wrong, or library function calls are detected. The other 50% comes
from a quick look at any extension program you submit.