Introduction to C

Lab 5: Introduction to C

1. Overview of Laboratory

The objective of this laboratory is to learn how to write, compile, and execute simple C programs in vscode. The sequence of steps to be performed is:

  1. Configuration of repo and vscode
  2. Create and run a vacuous program
  3. Extend (slightly) your vacuous program
  4. Write and execute a “real” program

What to turn in

You will perform all your work in your lab repo. You should submit a single document “report.md”. The document should include:

  1. A brief description of the lab
  2. Any issues that arose
  3. An explanation of each of the programs that you wrote. This should not be a verbatim “paste” but should
    • Describe the requirements and any assumptions for the program
    • Describe the program structure
    • Provide key code fragments and a description of what the fragments do and how they do it.
    • You don’t need to explain hello.c

The grading rubric for this lab is

  • Completion of tasks: 60%
    • Vacuous program : 5%
    • Word count : 20%
    • Makefile : 5%
    • Two module program: 30%
  • Lab report: 40%
    • Overview: 5%
    • Steps taken + issues: 15%
    • Discussion of your code and example output: 20%

2. Configuration

  1. Refresh the lab-instructions submodule (see lab 2 for directions)
  2. Copy lab-instructions/labs/lab5 to labs/lab5
  3. Change to the repo directory labs/lab5 and start vscode

    $ cd labs/lab5
    $ code .
  4. Check the extensions in your vscode instance to make sure that the C/C++ extension is installed. If not, install it.

The lab5 directory contains a “hidden” directory .vscode (you can see it by executing ls -a from the terminal). Within this directory are two vscode configuration files:

  • .vscode/tasks.json
  • .vscode/launch.json

The tasks file defines “build” tasks that help automate the edit-compile cycle. In each step of this lab, you will try compiling both from the command line and with the build task.

The launch.json provides configurations to help launch a debugger which we will not use in this lab.

3. Writing, compiling, and running a “vacuous” program

  1. Create a new file “hello.c”
  2. Type or cut/paste the following in your file:
     #include <stdio.h>
  
     int main() {
         printf("hello world\n");
     }
  1. From the Terminal menu, execute “Run Build Task” (⇧⌘B) and select the “C/C++ Clang build active file” option. This should open a terminal within vscode and create a file called “hello”
  2. Open a new vscode terminal (Terminal -> New Terminal)
  3. In that terminal, execute

    $ ./hello

    For simple (single module) programs, that’s all that is needed to compile and execute.

  4. You can also compile and run your program at the terminal command line:

    $ clang -o hello hello.c
    $ ./hello
  5. Now extend your hello.c program in hello-tick.c by adding a loop that (forever)

    • Prints “hello world”
    • Waits 3 seconds (use the sleep function; see man 3 sleep). You’ll need to add “#include <unistd.h>” to your code.
  6. Build and execute your program. You can force the program to terminate by typing Ctrl-C in the terminal window.

4. Writing and Testing a “Real” program.

Next you will write your own version of the Unix wc (word count) program. To see what wc does, make a plain-text file called haiku containing this text:

Ancient pond
Frog leaps
Splash!  

Here’s how to run wc on it (from the terminal):

$ wc < haiku 
 3  5 32

The results mean: 3 lines, 5 words, and 32 characters (bytes). The < means to hook up wc’s standard input to the haiku file. Alternatively, you can execute it as

$ cat haiku | wc

From your bash tutorial, you may recall that cat haiku prints the contents of haiku to the “standard output” (stdout) and ‘|’ “pipes” the stdout from the program on the left of the pipe to the stdin of the program on the right.

You can find out more about wc by typing man wc. In your version, though, don’t bother implementing any command-line options or reading the names of files from the command line. Just read from the standard input—called stdin in Unix.

There is a skeleton file wc.c. When you compile it into an executable file called mywc, you should be able to run it like this:

$ ./mywc < haiku 
 3  5 32

Reading from stdin

Here is a common C idiom for reading all characters from the standard input:

int c; /* current character */
    ...
while ((c = getchar()) != EOF) {
  /* do something with c */
}
A few notes about this idiom:

  • c is declared as an int, not a char.
  • The syntax inside the while condition means “call getchar(), assign the result to c, and execute the code between the { } only if the result was not EOF.” The syntax is terse, but you should make sure you understand it.
  • EOF is not a character. It’s a special negative int value (commonly -1), defined in stdio.h, while getchar() returns every real character as a non-negative value, so EOF can never be confused with a character. A char cannot hold every possible character value plus this one extra value, which is why c is declared as an int rather than a char.

Whitespace

How do you tell where one word ends and another begins? The rule is: words are separated by “whitespace”. Whitespace is traditionally defined to be any of these six characters:

| name | representation in C | what it looks like |
| --- | --- | --- |
| space | ' ' | |
| tab | '\t' | |
| carriage return | '\r' | |
| line feed (newline) | '\n' | |
| form feed | '\f' | |
| vertical tab | '\v' | |

(Of course none look like anything, because they’re whitespace.)

It’s legitimate to compare an int with a character. So, for example, you can write:

if (c == '\t') {
  ...
}

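Alternatively, you may use isspace(c) to test whether c is any of these six whitespace characters. (The similarly named isblank(c) matches only space and tab, so it would not treat a newline as a word separator.) Read about both:

man isspace

Notice that you will need to “include” <ctype.h> if you wish to use them.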

Printing numbers in C

Lastly, here is the usual way to print a number in C. You call the standard library function called printf, as in this tiny program:

#include <stdio.h>

int main() {
  int n = 17;
  printf("A haiku consists of %d characters in Japanese.\n", n);
  printf("In hexadecimal, that would be %x.\n", n);
}

The %d or %x gets replaced by the second argument to printf. The first argument is called the format string.

printf can do a lot, including print many numbers within the same format string, and print in different formats. Type man 3 printf to see some thorough but very terse documentation. (Plain man printf usually shows the shell command of the same name instead of the C function.)

Creating a Makefile

The principle of independent compilation was introduced in class, along with features of the C language that support this, and the make command which automates the building of programs in a manner that minimizes redundant work after part of the program has been changed. Some of the main points are:

  • make reads a file called a Makefile, which tells which files are made from which other files (for example, .o files are usually made from .c files), and what commands need to run to perform the compilation. Then it looks at the dates on the actual files to see which are out of date (or missing), and runs all the commands that need to be run.

  • A file to generate is called a target. A file or target that is needed in order to generate a target is called a dependency.

  • The following Makefile tells how to compile your wc assignment. Recall the targets are at the start of a line. Following the colon is a list of dependencies. The next line is the command to generate the target. Study this file for a couple minutes to get a feel for what it can do and the strange syntax of makefiles.

    CC = gcc
    CFLAGS = -g
    LDFLAGS = -g
    
    mywc: mywc.o
    	$(CC) $(LDFLAGS) -o mywc mywc.o
    
    mywc.o: mywc.c
    	$(CC) $(CFLAGS) -c mywc.c
    

As you can see, .o files depend on .c files, and executable files depend on .o files.

The -c option in some of the compilation lines tells the compiler to produce a linkable object file (with .o suffix), rather than an executable file.

Here’s how to run it:

  1. TRANSCRIBE (re-enter) the above into a file called Makefile in the same directory as the .c files. You are STRONGLY ENCOURAGED NOT to "copy-and-paste" the code from this file; the GOTCHA paragraph below explains why.
  2. In the same directory, type

    $ make mywc

    into the shell.

Notice that if you execute make mywc twice, it doesn't do anything the second time. It only performs the steps that need performing.

**GOTCHA:** The command lines in a Makefile must be indented with a single tab. Spaces won't do, even though they may look the same in a text editor. In its default setup, emacs displays tabs as ^T. With your file named as Makefile, emacs will recognize that a tab should be inserted when you press tab even if typically tabs are translated into spaces for your source files.

## 5. Creating a Two Module Program

In this section you will create a basic two module (two c file) program.  Compiling this in vscode is somewhat more complicated than for hello.c because of the need to add a build rule.
We'll start by creating a *skeleton* of the program you will ultimately create and get that to compile and execute.

You will be creating your own (very restricted) version of printf. Create a file `myprintf.c` with the contents shown below, and a corresponding header file `myprintf.h` that declares `myprintf`.

#include <stdarg.h>
#include <stdio.h>

static void printint(int i) {

}

static void printstring(char * s) {

}

static void printhex(unsigned int h) {

}

void myprintf(const char *fmt, ...) {
    const char *p;
    va_list argp;
    int i;
    char *s;

    va_start(argp, fmt);

    for (p = fmt; *p != '\0'; p++) {
        if (*p != '%') {
            putchar(*p);
            continue;
        }
        switch (*++p) {
        case 'c':
            i = va_arg(argp, int);
            putchar(i);
            break;

        case 'd':
            i = va_arg(argp, int);
            printint(i);
            break;

        case 's':
            s = va_arg(argp, char *);
            printstring(s);
            break;

        case 'x':
            i = va_arg(argp, int);
            printhex(i);
            break;

        case '%':
            putchar('%');
            break;
        }
    }
    va_end(argp);
}
Create another file `test.c`
# include "myprintf.h"

int main()
{
    //... test code ...
}
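
The lab does not show the contents of myprintf.h. A minimal sketch, assuming the header only needs to declare myprintf so that test.c can call it, might look like the following. The guard macro name MYPRINTF_H is an arbitrary choice.

```c
/* myprintf.h: a sketch of the public interface of myprintf.c */
#ifndef MYPRINTF_H
#define MYPRINTF_H

/* Restricted printf: handles %c, %d, %s, %x and %%. */
void myprintf(const char *fmt, ...);

#endif /* MYPRINTF_H */
```

The three helper functions are declared static in myprintf.c, so they are private to that file and do not belong in the header.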
You can compile these into a single program from the vscode terminal as

    $ clang -g test.c myprintf.c -o test


Once everything compiles, create a **Makefile** for this program and call it **Makefile-printf.mk**.  You can execute this as:

    $ make -f Makefile-printf.mk

Now, add and test the three missing functions `printint(), printstring(), printhex()` one at a time.  The three missing functions  should call `putchar()`, one character at a time, to produce the desired output. Here are some sample calls to myprintf and what they should print:


| call | should print |
| --- | --- |
| `myprintf("Nothing much\n");` | Nothing much |
| `myprintf("The letter %c\n", 'A');` | The letter A |
| `myprintf("A string: %s\n", "Splash!");` | A string: Splash! |
| `myprintf("The number %d\n", 11);` | The number 11 |
| `myprintf("The number %x in hexadecimal\n", 11);` | The number b in hexadecimal |
| `myprintf("%d is a negative number\n", -5);` | -5 is a negative number |
| `myprintf("The number %d\n", 'A');` | The number 65 |
| `myprintf("The number %x in hexadecimal\n", 'A');` | The number 41 in hexadecimal |

I recommend that you build your program iteratively -- get one function working and tested before moving on to the next. The easiest place to start is `printstring()` since that only requires looping over the string.

The following should help with the other two parts.


**ASCII codes**

There is a difference between a digit and the ASCII code for a digit. An ASCII code is a number that represents either a printable character (a letter, digit, or punctuation mark) or some simple action to take at the output device that is printing characters (like "newline" to start a new line of text, or "bell" to make a beep). Most hardware that deals with characters, such as printers and keyboards, responds to or generates ASCII codes.

On the web it is easy to find a chart of all the ASCII codes. Notice that the letter A has the ASCII code 65. When a program sends the ASCII code 65 to your shell or terminal window, it appears as the letter A. When you press the capital A on your keyboard, it sends the code 65 to the CPU. The digit 0, perhaps somewhat confusingly, has the ASCII code 48. If you print the byte 48 to the shell window, it will appear as a 0.

Notice that the ASCII code of each digit is the digit plus 48. So, if digit contains a decimal digit (in the range 0 to 9) that you'd like to print, then:

    ascii_digit = digit + 48;


sets ascii_digit to the ASCII code that represents digit. An equivalent, somewhat clearer way to write that in C is:

    ascii_digit = digit + '0';


The C compiler understands a character in single quotes, like '0', to mean the ASCII code of that character. In C such a character constant already has type int, so '0' is just another way of writing 48 in an expression. Both end up as the same bit pattern inside the computer.

A two-digit number needs two ASCII codes in a row, so (in decimal), the number 24 would come out as ASCII code 50 followed by 52 (a '2' followed by a '4').

Essentially, the job of your code that implements %d and %x is to convert a number into a sequence of ASCII codes that print out as the (decimal or hexadecimal) representation of that number.

**% and /**

The C language provides a handy operator for "peeling off" the last digit from an integer: the modulus operator, indicated by the % sign. The modulus operator returns the remainder after dividing its first operand by its second operand. So, a % b is the remainder after dividing a by b. This is handy for peeling off the last digit of a number, because taking the remainder after dividing by 10 gives you the last digit. For example:

    digit = n % 10;

In C, when you divide one integer by another, using the / operator, you get the quotient and the remainder is thrown away. So, for example, 11 / 4 will result in 2, since 4 goes into 11 twice (leaving a remainder of 3, which is thrown away). (This only applies to integers. When you divide floating-point numbers you get another floating-point number: there is no quotient and remainder in floating-point arithmetic.)

This makes / handy for removing the last digit from a number. This statement:

    n = n / 10;

divides n by 10 and throws away the remainder. So, in effect, it removes the last (decimal) digit from n and moves all the remaining digits (if any) one position to the right.

**A recursive solution**

The only remaining difficulty is that you need to produce the digits from high to low. (For this exercise, assume that only positive integers need to be handled.) One way to do this is recursively: the top-level call to `printint(num)` first calls `printint(num/10)` if num is 10 or greater, and then prints the digit `num%10`.

**Hexadecimal** 

Hexadecimal, commonly called "hex", is simply base 16, as opposed to the base 10 ("decimal") that we normally use. The rightmost digit is the 1's digit, the next digit is the 16's digit, the next one is the 256's digit, and so on (as opposed to 1's, 10's, 100's, and so on). Hexadecimal is handy for bit patterns because each hex digit corresponds to four bits, unlike decimal, where there is no simple correspondence between digits and bits.


| hex digit | 	decimal number |	binary number |
| --- | --- | --- |
| 0	 | 0	 | 0000 |
| 1	 | 1 | 0001 |
| 2	 | 2 | 0010 |
| 3	| 3	| 0011 |
| 4	 | 4 |	0100 |
| 5 |	5	| 0101 |
| 6 | 6	 | 0110 |
| 7	 | 7 | 	0111 |
| 8	| 8 | 	1000 |
| 9	| 9	 | 1001 |
| a	 | 10  | 1010 |
| b	 | 11	| 1011 |
| c	| 12	| 1100 |
| d	| 13	| 1101 |
| e	| 14	| 1110 |
| f	| 15	| 1111 |

In the C language, you can represent constants in hexadecimal by preceding them with 0x. So, for example, 0x41 represents the decimal number 65. (4 × 16 + 1 = 65.) Similarly, 0x2a4 = 676. (2 × 256 + 10 × 16 + 4 = 676.)

You might consider printing individual hex digits with a *lookup table*:

    const char hexlookup[16] = "0123456789abcdef";

Note