The Obscure Bits They Don't Teach You In COMP1911

The 'const' and 'volatile' keywords

Declaring a variable 'const' means that it gets its value when it is initialised and can never be assigned to after that. Often used for pointer (especially string) arguments to functions, e.g. 'const char *s', so that the function can't accidentally change the data the argument points to.

Declaring a variable 'volatile' means that the compiler will re-read it from memory every time it is referenced, instead of optimising it into a register or a constant. If the variable could change at any time independently of your program's normal flow, you should declare it volatile. Two reasons you might want to do this are:

  • The 'variable' is actually an IO port, not a memory location.
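  • The variable is changed by code the compiler can't see running, such as a signal handler or another thread in a multi-threaded program. Note that 'volatile' on its own does not make access atomic or ordered, so data shared between threads still needs a mutex or C11 atomics.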
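A minimal sketch of both keywords in use. The names 'count_spaces', 'stop_requested' and 'spin_until_stopped' are made up for illustration, and the flag is assumed to be set from something like a signal handler.

```c
#include <signal.h>
#include <stddef.h>

/* 'const char *s' promises that this function will not modify the
 * characters s points to; writing s[0] = 'x' here would not compile. */
size_t count_spaces(const char *s) {
    size_t n = 0;
    for (; *s != '\0'; s++) {   /* the pointer itself may still move */
        if (*s == ' ') {
            n++;
        }
    }
    return n;
}

/* Hypothetical flag that a signal handler could set at any moment.
 * 'volatile' makes the loop below re-read it on every iteration. */
volatile sig_atomic_t stop_requested = 0;

/* Counts up until the flag is set or max_iterations is reached. */
int spin_until_stopped(int max_iterations) {
    int i = 0;
    while (!stop_requested && i < max_iterations) {
        i++;
    }
    return i;
}
```

Without 'volatile', an optimising compiler would be entitled to read 'stop_requested' once and never look at it again.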

The In-line If Statement

variable = condition ? expression1 : expression2;

is the same as

if (condition) {
    variable = expression1;
} else {
    variable = expression2;
}
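Because the in-line if is an expression rather than a statement, it can go places an if statement can't, such as a return value or a function argument. A small sketch (the function names are made up for illustration):

```c
/* Returns the larger of two ints using the in-line if. */
int max_int(int a, int b) {
    return (a > b) ? a : b;
}

/* Picks a plural suffix, e.g. for printf("%d file%s\n", n, plural_suffix(n)). */
const char *plural_suffix(int count) {
    return (count == 1) ? "" : "s";
}
```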

Enums

A strange cross between a struct and a hash-define: it is declared like a struct, but all it does is create a set of named integer constants.

enum colours {
    RED,
    BLUE,
    GREEN,
    ...
};
 
// enumerators are used by their plain names; C has no 'colours.RED' syntax
enum colours paintjob = RED;
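A compilable sketch of the same enum, showing the values the compiler hands out and a typical switch over them. 'NUM_COLOURS' and 'colour_name' are additions for illustration, not part of the example above.

```c
enum colours {
    RED,         /* 0: the first enumerator defaults to zero */
    BLUE,        /* 1 */
    GREEN,       /* 2 */
    NUM_COLOURS  /* 3: a common trick for counting the enumerators */
};

/* Maps an enumerator back to a printable name. */
const char *colour_name(enum colours c) {
    switch (c) {
    case RED:   return "red";
    case BLUE:  return "blue";
    case GREEN: return "green";
    default:    return "unknown";
    }
}
```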

Function Pointers

Function pointers are, well, pointers to functions. Basically they allow you to pass a function as an argument to another function, so you can abstract out general routines, avoiding copy-paste coding, and generally do lots of really hacky stuff.

You declare a function pointer like any other pointer using an asterisk, however you need to put brackets around the asterisk and the name to make it clear that the pointer is the function itself and not its return value.

// a pointer to a function that takes two ints and returns an int
int (*add)(int a, int b);
 
// an array of pointers to functions that take an integer array and a double and return a double
double (*f[2])(int x[], double t);


Once you've declared it and assigned it, you can treat a function pointer exactly like an ordinary function:

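#define INTERVAL 0.01
 
// draw_point is assumed to be provided elsewhere, e.g. by a graphics library
void draw_point(double x, double y);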
 
double parabola (double x) {
    return x*x;
}
 
double hyperbola (double x) {
    return 1/x;
}
 
void plot (double (*f)(double x), double start, double end) {
    double x;
    double y;
    for (x = start; x < end; x += INTERVAL) {
        y = f(x);
        draw_point(x,y);
    }
}
 
void do_stuff(int flag) {
    double (*curve)(double x);
    if (flag) {
        curve = parabola;
    } else {
        curve = hyperbola;
    }
    plot(curve,-2.0,2.0);
 
}


Returning a function pointer from a function is pretty ugly. You wrap the whole function declaration in brackets and put the function pointer declaration around it:

// this function takes a string 'arg' and returns a pointer to a function that
// takes an int 'c' and returns an int
int (*get_character_test(char *arg))(int c);
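A typedef makes this a lot less ugly. The sketch below declares a function of the same shape via a typedef; the name 'char_test_fn' and the "digit"/"alpha" lookup are assumptions for illustration, and the argument is taken as 'const char *' because it is only read.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* char_test_fn: pointer to a function that takes an int and returns an int */
typedef int (*char_test_fn)(int c);

/* Same shape as: int (*get_character_test(const char *arg))(int c) */
char_test_fn get_character_test(const char *arg) {
    if (strcmp(arg, "digit") == 0) {
        return isdigit;
    }
    if (strcmp(arg, "alpha") == 0) {
        return isalpha;
    }
    return NULL; /* unknown test name */
}
```

The caller can then write 'get_character_test("digit")(c)' directly, or store the result in a 'char_test_fn' variable first.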


Extra credit: What the hell is this:

char (*(*x())[])();


Answer: a function returning a pointer to an array of pointers to functions returning char.
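One way to convince yourself of that answer is to rebuild the declaration one typedef at a time. This is only a sketch: the typedef names, the helper functions and the fixed array size of 2 are made up, and the functions take 'void' rather than the unspecified arguments of the original.

```c
/* pointer to a function returning char */
typedef char (*char_fn)(void);

/* array of (two) such pointers */
typedef char_fn char_fn_array[2];

static char give_a(void) { return 'a'; }
static char give_b(void) { return 'b'; }

static char_fn_array table = { give_a, give_b };

/* x: a function returning a pointer to an array of pointers to
 * functions returning char */
char_fn_array *x(void) {
    return &table;
}
```

Calling through it takes the same steps in reverse: '(*x())[1]()' calls x, dereferences the array pointer, picks element 1 and calls it.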

The Proper Way To Read Console Input

If you're lazy, you use 'scanf'. If you're masochistic, you use 'getchar'. If you're a civil or mechanical engineer, you use 'gets' and then wonder why it won't compile as Basic in Visual Studio.

But 'scanf' has a really big problem: it handles newlines badly, leaving them lying around in the input buffer, which often makes the next read look like an empty line.

The best, safest way to read console input is to read a whole line into a buffer using 'fgets', and use 'sscanf' to parse the string in the buffer using scanf-like syntax. It looks like this:

#include <stdio.h>
...
char input[MAX_LINE_LENGTH];
fgets(input, MAX_LINE_LENGTH, stdin);
// arg1 is an int; arg2 is a char array, so it is passed without '&'
sscanf(input, "some scanf format %d %s", &arg1, arg2);

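In real life you should also be checking the return values: 'fgets' returns NULL at end-of-file or on error, and 'sscanf' returns the number of fields it managed to convert. Really.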
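A minimal sketch of the same pattern with the checks in place. The "<age> <name>" line format, the function names and the buffer sizes are assumptions chosen for the example.

```c
#include <stdio.h>

#define MAX_LINE_LENGTH 128

/* Parses "<age> <name>" from one line of text.
 * Returns 1 if both fields were converted, 0 otherwise.
 * 'name' must have room for 16 chars (15 plus the terminator). */
int parse_age_and_name(const char *line, int *age, char name[16]) {
    /* sscanf returns the number of fields it converted */
    return sscanf(line, "%d %15s", age, name) == 2;
}

/* Reads one line from 'in' and parses it.
 * Returns 1 on success, 0 on end-of-file, read error or malformed input. */
int read_age_and_name(FILE *in, int *age, char name[16]) {
    char input[MAX_LINE_LENGTH];

    if (fgets(input, sizeof input, in) == NULL) {
        return 0; /* end-of-file or read error */
    }
    return parse_age_and_name(input, age, name);
}
```

Typical use would be 'read_age_and_name(stdin, &age, name)' in a loop that re-prompts whenever it returns 0.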

scanf Format Strings

Firstly, always include a field width for string formats, e.g. "%15s" if you have a 16 char buffer.

Did you know scanf has a regular-expression-like format identifier? Try "%8[0-9]" if you want to read a string of (at most) eight digits.

Another little-known format identifier is "%n", which stores the number of chars consumed so far into an int argument - in effect, the string index at that point.
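Both identifiers can be combined in a single call. A small sketch, with 'read_digits' as a made-up helper name:

```c
#include <stdio.h>

/* Copies up to eight leading digits from 'line' into 'digits', which
 * needs room for 9 chars. Returns the number of chars consumed, or -1
 * if the line does not start with a digit. */
int read_digits(const char *line, char digits[9]) {
    int consumed = 0;

    /* "%n" does not count towards sscanf's return value */
    if (sscanf(line, "%8[0-9]%n", digits, &consumed) != 1) {
        return -1;
    }
    return consumed;
}
```

The returned index tells you where to carry on parsing, e.g. 'line + consumed'. Note that, unlike "%s" and "%d", the "%[" identifier does not skip leading whitespace.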

A Note On Console Output

Console output with 'printf' or 'puts' is usually line-buffered when it goes to a terminal, so it only updates on the screen when a newline is printed (or the buffer fills up). If you do printf("prompt> "), it's entirely possible that it won't show up at all until you either print a newline or call 'fflush(stdout)' to force an update.
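A minimal sketch of a prompt helper that flushes straight away ('show_prompt' is a made-up name):

```c
#include <stdio.h>

/* Prints a prompt that has no trailing newline and pushes it out
 * immediately. Returns 0 on success, EOF if the write or flush failed. */
int show_prompt(const char *prompt) {
    if (fputs(prompt, stdout) == EOF) {
        return EOF;
    }
    return fflush(stdout);
}
```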

Abusing The Compiler

String Literal Concatenation

The compiler will automatically concatenate two string literals that are next to each other. This is useful for splitting up really long printf arguments, or reducing multiple printf calls to a single call.

// this is terrible to read
printf("Old Gregg: Maybe I could help you. I got The Funk.\nHoward Moon: Yes, you're very funky Gregg.\nOld Gregg: No no, you don't understand. I mean, I got The Funk right here; it's in this box!\n");
 
// this is slightly better, but still a little annoying
printf("Old Gregg: Maybe I could help you. I got The Funk.\n");
printf("Howard Moon: Yes, you're very funky Gregg.\n");
printf("Old Gregg: No no, you don't understand. I mean, I got The Funk right here; it's in this box!\n");
 
// this is using string literal concatenation
printf("Old Gregg: Maybe I could help you. I got The Funk.\n"
       "Howard Moon: Yes, you're very funky Gregg.\n"
       "Old Gregg: No no, you don't understand. I mean, I got The Funk right here; it's in this box!\n");

That's a neat formatting trick, but doesn't really add any new functionality. However, it also works with hash-defined string literals, allowing us to define a printf or scanf format once and use it all throughout our code. This saves us from copy-pasting hard-coded scanf and printf formats.

#define USERNAME_SCANF_FORMAT "%13s"
...
// buffer is a char array of at least 14 chars, so it is passed without '&'
scanf("My name is " USERNAME_SCANF_FORMAT, buffer);
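You can push this a step further and build the format from the buffer size itself, using the preprocessor's stringify operator, so the two can never drift apart. This goes beyond plain literal concatenation, and 'USERNAME_MAX', 'STR', 'XSTR' and 'parse_greeting' are names made up for this sketch.

```c
#include <stdio.h>

#define USERNAME_MAX 13

/* Two-level stringify so that USERNAME_MAX is expanded to 13 first */
#define STR(x) #x
#define XSTR(x) STR(x)

/* Expands to "%" "13" "s", which the compiler joins into "%13s" */
#define USERNAME_SCANF_FORMAT "%" XSTR(USERNAME_MAX) "s"

/* Parses "My name is <username>" from a line of text.
 * Returns 1 if a username was read, 0 otherwise. */
int parse_greeting(const char *line, char username[USERNAME_MAX + 1]) {
    return sscanf(line, "My name is " USERNAME_SCANF_FORMAT, username) == 1;
}
```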

Writing Portable, 64-bit Compatible Code

Compiling your program for 64-bit is effectively porting it to a different processor architecture, much like porting it to ARM. There are a few difficulties here: you can no longer make any assumptions about the size of data types (most obviously pointers), and once you move between architectures you can no longer make any assumptions about endianness either.

Endianness

Endianness refers to the order in which the bytes of a multi-byte value are stored: a big-endian platform stores the most significant byte first, a little-endian platform stores the least significant byte first. Intel x86 chips are little-endian. Motorola 68k chips are big-endian. Many modern chips, such as ARM and SPARC, are 'bi-endian' (they swing both ways, heh) and can be configured to be either big- or little-endian.
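You can see which kind of machine you are on by looking at the bytes of an integer in memory. A minimal sketch (the function names are made up for illustration):

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the host stores the least significant byte first. */
int is_little_endian(void) {
    uint32_t value = 1;
    unsigned char first_byte;

    memcpy(&first_byte, &value, 1); /* the byte at the lowest address */
    return first_byte == 1;
}

/* Copies the four bytes of 'value' in the order they sit in memory. */
void bytes_in_memory(uint32_t value, unsigned char out[4]) {
    memcpy(out, &value, 4);
}
```

For 0x11223344, a little-endian host gives 44 33 22 11 and a big-endian host gives 11 22 33 44.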

The word 'endianness' comes from Gulliver's Travels, where the Big-Endians and Little-Endians fight over which end of a boiled egg to crack; the original essay on byte order that introduced the term is Danny Cohen's 'On Holy Wars and a Plea for Peace', google it.

A binary file format should define endianness, e.g. FLAC audio files store almost all numbers in big-endian format. By convention, multi-byte integers sent over the network are in big-endian format ('network byte order'). Use the POSIX functions 'htons', 'htonl', 'ntohs', and 'ntohl' (declared in <arpa/inet.h>) to convert to and from network byte order.
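A minimal sketch of those functions in use, assuming a POSIX system. 'put_u32' and 'get_u32' are made-up helper names.

```c
#include <arpa/inet.h> /* htonl, ntohl (POSIX) */
#include <stdint.h>
#include <string.h>

/* Writes 'value' into 'buf' in network (big-endian) byte order. */
void put_u32(unsigned char buf[4], uint32_t value) {
    uint32_t wire = htonl(value);
    memcpy(buf, &wire, sizeof wire);
}

/* Reads a network-order 32-bit value from 'buf' back into host order. */
uint32_t get_u32(const unsigned char buf[4]) {
    uint32_t wire;
    memcpy(&wire, buf, sizeof wire);
    return ntohl(wire);
}
```

On a big-endian host 'htonl' and 'ntohl' do nothing; on a little-endian host they swap the bytes. Either way the buffer holds the same bytes.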

Data Type Sizes

You can't assume that different compilers on different platforms will have the same size longs, or produce structs of the same size. For example, the TELE3118 project from S2 2009 defined the format of a server registration message as

typedef struct {
    unsigned long nusers; /* number of users (no more than 50) */
    struct UserInfo_s {
        char username[14];
        unsigned short tcpPort;
        unsigned long ipAddr;
    } user[50]; /* info about each user */
} RegRespMsg_t;

This struct is sent across the network to a remote host, by passing a pointer to it to a write call on a socket. However, what if the sending host is x86 (32-bit) and the receiving host is x86_64 (64-bit)? Let's run a short program using 'sizeof' on a 64-bit Linux machine and a 32-bit Windows machine, and look at the differences.

64-bit:

samwise@hex in OtherPrograms $ ./sizes
----- SIZES -----
int             = 4 bytes
char            = 1 bytes
short           = 2 bytes
unsigned        = 4 bytes
long            = 8 bytes
long long       = 8 bytes
float           = 4 bytes
double          = 8 bytes

32-bit:

$ ./sizes.exe
----- SIZES -----
int             = 4 bytes
char            = 1 bytes
short           = 2 bytes
unsigned        = 4 bytes
long            = 4 bytes
long long       = 8 bytes
float           = 4 bytes
double          = 8 bytes

It's mostly the same, but importantly, a 'long' is 8 bytes on the first machine but only 4 bytes on the second. The RegRespMsg_t struct contains unsigned longs. So if the first machine reads a struct that was written by the second, the fields won't line up: it will read the wrong bytes for each field and get corrupted data.
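The program behind the listings above isn't shown, so here is a rough sketch of what something like it could look like. The 'PRINT_SIZE' macro and 'print_sizes' function are assumptions, not the original code.

```c
#include <stdio.h>

/* Prints one "type = N bytes" row; #type turns the type name into a string. */
#define PRINT_SIZE(type) printf("%-15s = %zu bytes\n", #type, sizeof(type))

void print_sizes(void) {
    printf("----- SIZES -----\n");
    PRINT_SIZE(int);
    PRINT_SIZE(char);
    PRINT_SIZE(short);
    PRINT_SIZE(unsigned);
    PRINT_SIZE(long);
    PRINT_SIZE(long long);
    PRINT_SIZE(float);
    PRINT_SIZE(double);
}
```

Call 'print_sizes()' from main and compare the output across your target platforms. The only size the C standard pins down exactly is 'sizeof(char) == 1'.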

The proper solution is to use a serialised data format or library, like Java's 'java.io.Serializable', Python's 'pickle' module, or Ruby's 'YAML' thing. You could use XML, although I'd personally rather have CAT5 cable inserted up my arse than parse XML in C.

The quick-and-dirty solution is to use explicit 16- and 32-bit data types like 'u_int32_t' (or the standard C99 equivalents 'uint16_t' and 'uint32_t' from <stdint.h>):

#include <sys/types.h>
...
typedef struct {
    u_int32_t nusers; /* number of users (no more than 50) */
    struct UserInfo_s {
        char username[14];
        u_int16_t tcpPort;
        u_int32_t ipAddr;
    } user[50]; /* info about each user */
} RegRespMsg_t;

And this is in fact what I did to fix the example assignment code we were given.
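Fixed-width types fix the sizes, but sending the raw struct still relies on both ends agreeing on padding and endianness. A more robust (though still hand-rolled) approach is to write each field into a byte buffer yourself. The sketch below does that for one user record; the 20-byte wire layout, 'pack_user' and 'unpack_user' are assumptions for illustration, not part of the assignment's protocol.

```c
#include <stdint.h>
#include <string.h>

#define USERNAME_LEN 14
#define USER_WIRE_SIZE (USERNAME_LEN + 2 + 4) /* 20 bytes on the wire */

struct UserInfo_s {
    char username[USERNAME_LEN];
    uint16_t tcpPort;
    uint32_t ipAddr;
};

/* Writes one record into 'out' (USER_WIRE_SIZE bytes): big-endian and
 * with no padding, whatever the compiler's struct layout is. */
void pack_user(const struct UserInfo_s *u, unsigned char *out) {
    memcpy(out, u->username, USERNAME_LEN);
    out[14] = (unsigned char)(u->tcpPort >> 8);
    out[15] = (unsigned char)(u->tcpPort & 0xFF);
    out[16] = (unsigned char)(u->ipAddr >> 24);
    out[17] = (unsigned char)((u->ipAddr >> 16) & 0xFF);
    out[18] = (unsigned char)((u->ipAddr >> 8) & 0xFF);
    out[19] = (unsigned char)(u->ipAddr & 0xFF);
}

/* Reads one record back out of a USER_WIRE_SIZE-byte buffer. */
void unpack_user(const unsigned char *in, struct UserInfo_s *u) {
    memcpy(u->username, in, USERNAME_LEN);
    u->tcpPort = (uint16_t)((in[14] << 8) | in[15]);
    u->ipAddr = ((uint32_t)in[16] << 24) | ((uint32_t)in[17] << 16) |
                ((uint32_t)in[18] << 8) | (uint32_t)in[19];
}
```

The shifts work on values rather than memory, so the same code produces the same bytes on any host.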

Useful Libraries
