Compilers

Choices

The recommendations made for this unit are:

So a typical compile command is:

clang -std=c11 -Wall -pedantic -g -fsanitize=undefined -fsanitize=address eg.c -o eg

This is long enough to be worth using make

Available compilers

The compilers worth talking about are:

MSVC

Windows provides MSVC, e.g. via Visual Studio

But its emphasis is on C++, and it meets only the C90 'standalone' standard for C (meant for embedded processors). It makes a lot of Windows-only assumptions and promotes Windows-only libraries, so your chances of producing a portable program are very low

If you are addicted to the Visual Studio user interface, there is a version of clang which is bundled with VS as its IDE - but you still need to beware of various language and library pitfalls

GCC

gcc is old, but reliable and works everywhere

It is almost completely compatible with clang, accepting almost all the same options

The main problem is that, on Linux or Mac, it is a built-in part of the operating system, so it is very difficult to upgrade to a newer version

And a very common version (in the lab and on Codio for example) is 4.8, but you need 4.9 at least to use the advanced debugging flags (and 8.2 is current!)

The simplest way round the problem is to use clang

Clang

clang is newer than gcc and better designed

The most important advantages are (a) it produces better error and warning messages and (b) it is more likely to support the advanced debugging options (because of the version problem with gcc)

There are other technical advantages which won't matter in this course

The only disadvantage I know is that, on Windows, you need to change -o program to -o program.exe

Installing

On Linux, use gcc if it is version 4.9 or later, or install clang with sudo apt install clang

On a Mac, clang is installed (with synonym gcc) but you should install homebrew to get the other command line tools you need (and a package manager)

On Windows, install Linux as dual boot if you can - if not, install Cygwin, use its setup program as a package manager, and install gcc and/or clang (beware things will get difficult when we reach pointers and graphics)

Which standard?

There are four choices:

  1. don't bother about standards
  2. use the 1990 standard (C90)
  3. use the 1999 standard (C99)
  4. use the 2011 standard (C11)

C90 is also called C89 or ANSI C, but those are local USA names, not the international name

Don't bother?

The problem with the "don't bother" option is that your program will only work on your computer

If you are a student and you do an assignment in C on Windows, and it gets marked on Linux, or vice versa, it probably won't even compile, so you get 0%

And if you are working in industry, you should aim to produce cross-platform programs (or at least write in a cross-platform style, then adjust for your platform)

Going beyond the standards

You can go beyond the standards, but it is a very specialised thing to do - doing it professionally involves:

So for ordinary, everyday purposes, it is really out of the question: you need to stick to a standard

1990, 1999 or 2011?

The 1990 standard is ancient and lacks more recent language developments, so advice from the web often doesn't work - that's not good

The 1999 standard is more-or-less OK, just obsolete since 2011

The 2011 standard is the standard for C, adding (e.g.) threads (not used in this unit) and improving compatibility with C++

The C11 standard says it cancels and replaces the previous standards, so technically it is the only one!

Reluctance

The 2011 standard is the right one to use

There is a reluctance among a lot of C programmers to move to the 2011 standard, or even the 1999 standard

That appears to be for two reasons

Single platforms

One reason for reluctance to use C11 is that a lot of programmers only write programs for one platform

And they use the native compiler available on the platform, even though it doesn't meet the standards

(The most obvious platform like this is Windows, but it isn't the only one)

That's useless for this unit

Misreading the standard

An example of the second reason is that Wikipedia says: due to delayed availability of conforming C99 implementations, C11 makes certain features optional

This is a complete misunderstanding: all the standards have had optional features, to allow 'standalone' versions of C on embedded processors. The C11 standard just makes them more explicit - in particular, the new standard for threads does not apply to small embedded uniprocessors

The -std=c11 option

The option you need to give the gcc or clang compiler is -std=c11 to switch on the 2011 standard

(You would use -ansi or -std=c89 or -std=c90 for the 1990 standard, or -std=c99 for the 1999 standard)

The -Wall option

Options starting with -W (capital) relate to warnings

The -Wall option tells the compiler to switch on 'all' warnings (actually all common warnings)

The more problems that are detected as warnings when you compile, the less runtime debugging you have to do - so you should switch them all on

The -pedantic option

This switches on warnings to do with extensions to the standard

This option shouldn't be important, but every year one or two students get into trouble through stumbling on non-standard gcc extensions which don't work when the program is compiled for marking (e.g. having a function inside a function)

So it is recommended for this unit

Messages

Compiler error and warning messages can be irritating, but you must learn to interpret them - always ask if you can't work them out

And you must make your programs free of warnings, because otherwise:

Interpreting messages

It is difficult to give good advice on interpreting error and warning messages, but here are 2 pieces of advice:

If there are a lot of messages, ignore all but the first

Suppose the message points out a problem on line 15, and you can't see what it is complaining about

The most common reason is that you have left out a semicolon at the end of line 14

The problem can't be after the place the message points to, but it can be before, maybe a long way before

The -g option

The -g option asks for debugging info to be added

The debugging tool gdb can then be used

A program compiled for debugging potentially runs very slowly so, once the program works, you can replace -g with -O2

That's a capital letter O, not a zero 0, plus an optimization level (-O2 is the highest safe level, -O3 means 'add aggressive less safe optimizations')

Then the program compiles more slowly and runs faster

-fsanitize=undefined

The -fsanitize=undefined advanced debugging option adds extra code to your program to check for 'undefined behaviour'

Undefined behaviour is where the program is illegal but C can't normally detect it

The standard says in that case that anything can happen, and the program typically runs on past the error then crashes or gives wrong results, without giving any hint of where the bug is

Undefined example

Suppose you have an array int a[10] and you index it with a variable n which is too big

You may access memory which doesn't belong to your program, so the program crashes

Or you may access some other variable in your program, producing stupid results

With -fsanitize=undefined the program reports the problem at the moment it happens, printing a suitable message

-fsanitize=address

The -fsanitize=address advanced debugging option adds extra code to your program to check memory access

With int a[10], the compiler knows how big the array is and can add safety checks

But if an array is allocated dynamically (using malloc) the compiler doesn't know the size

The -fsanitize=address option adds library and program code to cover this case by tracking allocated lumps of memory

Windows

The advanced debugging options aren't available on Windows (not even under Cygwin or Bash using clang)

One of the two can be made to half-work with:

-fsanitize=undefined -fsanitize-undefined-trap-on-error

The program stops on an 'undefined' error, there is no library to form an error message, but gdb can be used in a minimal way to check the error

If this situation improves, let me know

Using make

Using all the options leads to a long command

It would be pretty stupid to type it over and over again, because it is time-consuming and error-prone

Use arrow keys to find previous commands - then you should only have to type the command once per session

Use aliases or shell scripts or build tools so you don't even have to do that

Of these, make is recommended and a Makefile will be provided for each assignment

Libraries

Normally, we expect you to use the standard C library, with headers such as stdio.h, stdlib.h, stdbool.h, ...

In addition, for some assignments, we may expect you to use a non-standard but portable library such as SDL - it will have to be installed separately for each platform you use

The Posix standard

In Linux and other Unix-based systems, many incompatible non-standard libraries arose for a variety of ordinary practical purposes

To bring order to the chaos, the Posix standard was created

It is an intermediate library - a 'standard non-standard' library

We allow it in this unit, since macOS and Windows can support it

Using Posix

Suppose you want to write an animated program using terminal I/O

This is not really recommended other than for fun, because it is unlikely to work reliably across platforms, but it is OK in this unit

For that, you need a function like usleep or nanosleep for short animation pauses

These are not in the C standard libraries, but they are in the Posix standard

Posix problems

To use a Posix function, include a header, e.g.

#include <unistd.h>

But this may not work, because with -std=c11 the unistd header only declares the functions which happen to be in the C standard

You have to make an extra declaration to make it clear that you are relying on the Posix standard and, in particular, which version

Posix declarations

To use Posix with C, either add a #define before the #include in the source:

#define _POSIX_C_SOURCE 200809L
#include <unistd.h>

or put the equivalent declaration in the compile command:

gcc -D_POSIX_C_SOURCE=200809L ...