Bit Operations - Simple Programming - Practical C Programming, 3rd Edition (2011)

Practical C Programming, 3rd Edition (2011)

Part II. Simple Programming

Chapter 11. Bit Operations

To be or not to be, that is the question

—— Shakespeare, on boolean algebra [Hamlet, Act 3, Scene 1]

This chapter discusses bit-oriented operations. A bit is the smallest unit of information. Normally, it is represented by the values 1 and 0. (Other representations include on/off, true/false, and yes/no.) Bit manipulations are used to control the machine at the lowest level. They allow the programmer to get under the hood of the machine. Many higher-level programs will never need bit operations. Low-level coding, like writing device drivers or pixel-level graphic programming, requires bit operations.

Eight bits together form a byte, represented by the C data type char.[13]

A byte might contain the following bits:

01100100

This bit structure can also be written as the hexadecimal number 0x64. (C uses the prefix “0x” to indicate a hexadecimal (base 16) number.) Hexadecimal is convenient for representing binary data because each hexadecimal digit represents 4 binary bits. Table 11-1 gives the hexadecimal-to-binary conversion:

Table 11-1. Hexadecimal and Binary

Hexadecimal

Binary

Hexadecimal

Binary

0

0000

8

1000

1

0001

9

1001

2

0010

A

1010

3

0011

B

1011

4

0100

C

1100

5

0101

D

1101

6

0110

E

1110

7

0111

F

1111

So the hexadecimal number 0xAF represents the binary number 10101111.

The printf format for hexadecimal is %x; for octal the format is %o. So:

int number = 0xAF;

printf("Number is %x %d %o\n", number, number, number);

prints:

af 175 257

Many novice programmers get a number confused with its representation. They ask questions like, “How does a number know if it’s a hexadecimal number or decimal?”

To illustrate the difference between these two concepts, consider a bag of marbles as illustrated in Figure 11-1.

A bag of marbles

Figure 11-1. A bag of marbles

This bag contains 17 marbles. But is that hexadecimal 0x11, decimal 17, octal 021, or binary 10001? The answer is that the number is 17 no matter what. How we choose to represent it (octal, decimal, hex, or binary) is up to us. The marbles don’t care how we count them.

Bit Operators

Bit operators allow the programmer to work on individual bits. For example, a short integer holds 16 bits (on most machines). The bit operators treat each bit as independent. By contrast, an add operator treats the 16 bits as a single 16-bit number.

Bit operators allow the programmer to set, clear, test, and perform other operations on bits. The bit operators are listed in Table 11-2.

Table 11-2. Bitwise Operators

Operator

Meaning

&

Bitwise and

|

Bitwise or

^

Bitwise exclusive or

~

Complement

<<

Shift left

>>

Shift right

These operators work on any integer or character data type.


[13] Technically speaking, the C standard does not specify the number of bits in a character. However, on every machine I know of, a C character is 8 bits.

The and Operator (&)

The and operator compares two bits. If they both are one, the result is one. The results of the and operator are defined according to Table 11-3.

Table 11-3. and Operator

Bit1

Bit2

Bit1 & Bit2

0

0

0

0

1

0

1

0

0

1

1

1

When two eight-bit variables (char variables) are “anded” together, the and operator works on each bit independently. The following program segment illustrates this operation:

int c1, c2;

c1 = 0x45;

c2 = 0x71;

printf("Result of %x & %x = %x\n", c1, c2, (c1 & c2));

The output of this program is:

Result of 45 & 71 = 41

This is because:

c1 = 0x45 binary 01000101

& c2 = 0x71 binary 01110001

_______________________________

= 0x41 binary 01000001

The bitwise and (&) is similar to the logical and (&&). In the logical and, if both operands are true (nonzero), the result is true (1). In the bitwise and (&), if the corresponding bits of both the operands are true (ones), then the corresponding bits of the results are true (ones). So, the bitwise and (&) works on each bit independently while the logical and (&&) works on the operands as a whole.

However & and && are different operators, as Example 11-1 illustrates.

Example 11-1. and/and.c

#include <stdio.h>

int main()

{

int i1, i2; /* two random integers */

i1 = 4;

i2 = 2;

/* Nice way of writing the conditional */

if ((i1 != 0) && (i2 != 0))

printf("Both are not zero\n");

/* Shorthand way of doing the same thing */

/* Correct C code, but rotten style */

if (i1 && i2)

printf("Both are not zero\n");

/* Incorrect use of bitwise and resulting in an error */

if (i1 & i2)

printf("Both are not zero\n");

return (0);

}

Question: Why does test #3 fail to print Both are not zero?

Answer: The operator & is a bitwise and. The result of the bitwise and is zero.

i1=4 00000100

i2=2 00000010

___________________

& 00000000

The result of the bitwise and is 0, and the conditional is false. If the programmer had used the first form:

if ((i1 != 0) && (i2 != 0))

and made the mistake of using & instead of &&:

if ((i1 != 0) & (i2 != 0))

the program would still have executed correctly.

(i1 != 0)

is true (result = 1)

(i2 != 0)

is true (result = 1)

1 bitwise and 1 is 1 so the expression is true.[14]

You can use the bitwise and operator to test if a number is even or odd. In base 2, the last digit of all even numbers is zero and the last digit of all odd numbers is one. The following function uses the bitwise and to pick off this last digit. If it is zero (an even number), the result of the function is true.

int even(const int value)

{

return ((value & 1) == 0);

}

NOTE

This procedure uses a programming technique known by the technical name “a cute trick.” Normally, cute tricks should be avoided whenever possible. Better programming practice would be to use the modulus (%) operator in this case. The only reason we’ve used the and operator (&) is because we are discussing it in this section.


[14] Soon after I discovered the bug illustrated by this program, I told my office mate, “I now understand the difference between and and and and,” and he understood me. How we understand language has always fascinated me, and the fact that I could utter such a sentence and have someone understand it without trouble amazed me.

Bitwise or (|)

The inclusive or operator (also known as just the or operator) compares its two operands and if one or the other bit is a one, the result is a one. Table 11-4 lists the truth table for the or operator.

Table 11-4. or Operator

Bit1

Bit2

Bit1 | Bit2

0

0

0

0

1

1

1

0

1

1

1

1

On a byte, this would be:

i1=0x47 01000111

| i2=0x53 01010011

_______________________

= 0x57 01010111

The Bitwise Exclusive or (^)

The exclusive or (also known as xor) operator results in a one when either of its two operands is a one, but not both. The truth table for the exclusive or operator is listed in Table 11-5.

Table 11-5. Exclusive or

Bit1

Bit2

Bit1 ^ Bit2

0

0

0

0

1

1

1

0

1

1

1

0

On a byte this would be:

i1=0x47 01000111

^ i2=0x53 01010011

_______________________

= 0x14 00010100

The Ones Complement Operator (Not) (~)

The not operator (also called the invert operator, or bit flip) is a unary operator that returns the inverse of its operand as shown in Table 11-6.

Table 11-6. not Operator

Bit

~Bit

0

1

1

0

On a byte, this is:

c= 0x45 01000101

______________________

~c= 0xBA 10111010

The Left- and Right-Shift Operators (<<, >>)

The left-shift operator moves the data to the left a specified number of bits. Any bits that are shifted out the left side disappear. New bits coming in from the right are zeros. The right shift does the same thing in the other direction. For example:

c=0x1C

00011100

c << 1

c=0x38

00111000

c >> 2

c=0x07

00000111

Shifting left by one (x << 1) is the same as multiplying by 2 (x * 2). Shifting left by two (x << 2) is the same as multiplying by 4 (x * 4, or x * 22). You can see a pattern forming here. Shifting left by n places is the same as multiplying by 2n. Why shift instead of multiply? Shifting is faster than multiplication, so:

i = j << 3; /* Multiple j by 8 (2**3) */

is faster than:

i = j * 8;

Or it would be faster if compilers weren’t smart enough to turn “multiply by power of two” into “shift.”

Many clever programmers use this trick to speed up their programs at the cost of clarity. Don’t you do it. The compiler is smart enough to perform the speedup automatically. The result of putting in a shift gains you nothing at the expense of clarity.

The left-shift operator multiplies; the right-shift operator divides. So:

q = i >> 2;

is the same as:

q = i / 4;

Again, this trick is clever but should not be used in modern code.

Right-Shift Details

Right shifts are particularly tricky. When a variable is shifted to the right, C needs to fill the space on the left-hand side with something. For signed variables, C uses the value of the sign bit. For unsigned variables, C uses zero. Table 11-7 illustrates some typical shifts.

Table 11-7. Right-Shift Examples

signed char

signed char

unsigned char

Expression

9 >> 2

-8 >> 2

248 >> 2

Binary Value >> 2

0000 1001 >> 2

1111 1000 >> 2

1111 1000 >> 2

Result

??00 0010

??11 1110

??11 1110

Fill

Sign Bit (0)

Sign Bit (1)[a]

Zero

Final Result (Binary)

0000 0010

1111 1110

0011 1110

Final Result (short int)

2

-2

62

[a] The ANSI standard specifies that right shifts may be arithmetic (sign bit fill) or logical (zero bit fill). Almost all compilers use the arithmetic right shift.

Setting, Clearing, and Testing Bits

A character (char) contains eight bits. Each of these can be treated as a separate flag.

Bit operations can be used to pack eight single-bit values in a single byte. For example, suppose we are writing a low-level communications program. We are going to store the characters in an 8k buffer for later use. With each character we will also store a set of status flags. The flags are listed in Table 11-8.

Table 11-8. Communications Status Values

Name

Description

ERROR

True if any error is set.

FRAMING_ERROR

A framing error occurred for this character.

PARITY_ERROR

Character had the wrong parity.

CARRIER_LOST

The carrier signal went down.

CHANNEL_DOWN

Power was lost on the communication device.

We could store each of these flags in its own character variable. That format would mean that for each character buffered, we would need five bytes of status storage. For a large buffer, that storage adds up. If we instead assign each status flag its own bit within an 8-bit status character, we cut our storage requirements down to one-fifth of our original need.

We assign our flags the bit numbers listed in Table 11-9.

Table 11-9. Bit Assignments

Bit

Name

0

ERROR

1

FRAMING_ERROR

2

PARITY_ERROR

3

CARRIER_LOST

4

CHANNEL_DOWN

Bits are numbered 76543210 by convention, as seen in Table 11-9. In the Bit numbers table, we have bits 4 and 1 set.

Bit numbers

7

6

5

4

3

2

1

0

0

0

0

1

0

0

1

0

The constants for each bit are defined in Table 11-10.

Table 11-10. Bit Values

Bit

Binary value

Hexadecimal constant

7

10000000

0x80

6

01000000

0x40

5

00100000

0x20

4

00010000

0x10

3

00001000

0x08

2

00000100

0x04

1

00000010

0x02

0

00000001

0x01

The definitions could be:

/* True if any error is set */

const int ERROR = 0x01;

/* A framing error occurred for this character */

const int FRAMING_ERROR = 0x02;

/* Character had the wrong parity */

const int PARITY_ERROR = 0x04;

/* The carrier signal went down */

const int CARRIER_LOST = 0x08;

/* Power was lost on the communication device */

const int CHANNEL_DOWN = 0x10;

This method of defining bits is somewhat confusing. Can you tell (without looking at the table) which bit number is represented by the constant 0x10? Table 11-11 shows how we can use the left-shift operator (<<) to define bits.

Table 11-11. Left-Shift Operator and Bit Definition

C Representation

Base 2 Equivalent

Result (Base 2)

Bit Number

1<<0

00000001 << 0

00000001

Bit 0

1<<1

00000001 << 1

00000010

Bit 1

1<<2

00000001 << 2

00000100

Bit 2

1<<3

00000001 << 3

00001000

Bit 3

1<<4

00000001 << 4

00010000

Bit 4

1<<5

00000001 << 5

00100000

Bit 5

1<<6

00000001 << 6

01000000

Bit 6

1<<7

00000001 << 7

10000000

Bit 7

Although you cannot easily tell what bit is represented by 0x10, you can easily tell what bit is meant by 1<<4.

So our flags can be defined as:

/* True if any error is set */

const int ERROR = (1<<0);

/* A framing error occurred for this character */

const int FRAMING_ERROR = (1<<1);

/* Character had the wrong parity */

const int PARITY_ERROR = (1<<2);

/* The carrier signal went down */

const int CARRIER_LOST = (1<<3);

/* Power was lost on the communication device */

const int CHANNEL_DOWN = (1<<4);

Now that we have defined the bits, we can manipulate them. To set a bit, use the | operator. For example:

char flags = 0; /* start all flags at 0 */

flags |= CHANNEL_DOWN; /* Channel just died */

To test a bit, we use the & operator to mask out the bits:

if ((flags & ERROR) != 0)

printf("Error flag is set\n");

else

printf("No error detected\n");

Clearing a bit is a harder task. Suppose we want to clear the bit PARITY_ERROR. In binary this bit is 00000100. We want to create a mask that has all bits set except for the bit we want to clear (11111011). This step is done with the not operator (~). The mask is then anded with the number to clear the bit.

PARITY_ERROR 00000100

~PARITY_ERROR 11111011

flags 00000101

_______________________________________

flags & ~PARITY_ERROR 00000001

In C, you should use:

flags &= ~PARITY_ERROR; /* Who cares about parity */

Question 11-1: In Example 11-2, the HIGH_SPEED flag works, but the DIRECT_CONNECT flag does not. Why? (Click here for the answer Section 11.9)

Example 11-2. high/high.c

#include <stdio.h>

const int HIGH_SPEED = (1<<7); /* modem is running fast */

/* we are using a hardwired connection */

const int DIRECT_CONNECT = (1<<8);

char flags = 0; /* start with nothing */

int main()

{

flags |= HIGH_SPEED; /* we are running fast */

flags |= DIRECT_CONNECT;/* because we are wired together */

if ((flags & HIGH_SPEED) != 0)

printf("High speed set\n");

if ((flags & DIRECT_CONNECT) != 0)

printf("Direct connect set\n");

return (0);

}

Bitmapped Graphics

More and more computers now have graphics. For the PC, there are graphics devices like EGA and VGA cards. UNIX offers the X windowing system.

In bitmapped graphics, each pixel on the screen is represented by a single bit in memory. For example, Figure 11-2 shows a 14-by-14 bitmap as it appears on the screen, and enlarged so that you can see the bits.

Bitmap, actual size and enlarged

Figure 11-2. Bitmap, actual size and enlarged

Suppose we have a small graphic device—a 16-by-16-pixel black-and-white display. We want to set the bit at 4,7. The bitmap for this device is shown as an array of bits in Figure 11-3.

Array of bits

Figure 11-3. Array of bits

But we have a problem. No data type exists for an array of bits in C. The closest we can come is an array of bytes. Our 16-by-16 array of bits now becomes a 2-by-16 array of bytes, as shown in Figure 11-4.

Array of bytes

Figure 11-4. Array of bytes

Let’s see what we need to do to transform our x, y index to a byte_x, byte_y, bit_index, and bit.

The byte_y index is the same as the y index. That transformation is simple:

byte_y = y;

A byte contains 8 bits. So, in the X direction, our byte index is eight times our bit index. This transformation leaves us with the code:

byte_x = x / 8;

Now we need the bit index. The index starts out at 0, goes to 7, and then goes back to 0. This change gives us the code:

bit_index = x % 8;

Now to specify the bit itself. A bit index of zero indicates the left-most bit or the bit represented by 1000 00002 or 0x80. A bit index of 1 indicates the next-to-the-left-most bit 0100 00002 or 0x80 >> 1. So the bit we want is given by the expression:

bit = 0x80 >> bit_index;

The full algorithm looks like:

byte_y = y;

byte_x = x / 8;

bit_index = x % 8;

bit = 0x80 >> bit_index;

graphics[byte_x][byte_y] |= bit;

This algorithm can be condensed into a single macro:

#define set_bit(x, y) graphics[(x)/8][y] |= (0x80 >> ((x)%8))

For example, to set the pixel at bit number 4,7, we need to set the fourth bit of byte 0,7. Our macro would generate the statement:

bit_array[0][7] |= (0x80 >> (4));

Example 11-3 draws a diagonal line across the graphics array and then prints the array on the terminal.

Example 11-3. graph/graph.c

#include <stdio.h>

#define X_SIZE 40 /* size of array in X direction */

#define Y_SIZE 60 /* size of array in Y direction */

/*

* We use X_SIZE/8 because we pack 8 bits per byte

*/

char graphics[X_SIZE / 8][Y_SIZE]; /* the graphics data */

#define SET_BIT(x,y) graphics[(x)/8][y] |= (0x80 >>((x)%8))

int main()

{

int loc; /* current location we are setting */

void print_graphics(void); /* print the data */

for (loc = 0; loc < X_SIZE; ++loc)

SET_BIT(loc, loc);

print_graphics();

return (0);

}

/********************************************************

* print_graphics -- Prints the graphics bit array *

* as a set of X and .'s. *

********************************************************/

void print_graphics(void)

{

int x; /* current x BYTE */

int y; /* current y location */

unsigned int bit; /* bit we are testing in the current byte */

for (y = 0; y < Y_SIZE; ++y) {

/* Loop for each byte in the array */

for (x = 0; x < X_SIZE / 8; ++x) {

/* Handle each bit */

for (bit = 0x80; bit > 0; bit = (bit >> 1)) {

if ((graphics[x][y] & bit) != 0)

printf("X");

else

printf(".");

}

}

printf("\n");

}

}

The program defines a bitmapped graphic array:

char graphics[X_SIZE / 8][Y_SIZE]; /* the graphics data */

The constant X_SIZE/8 is used because we have X_SIZE bits across, which translates to X_SIZE/8 bytes.

The main for loop:

for (loc = 0; loc < X_SIZE; ++loc)

set_bit(loc, loc);

draws a diagonal line across the graphics array.

Because we do not have a bitmapped graphics device, we will simulate it with the subroutine print_graphics.

The loop:

for (y = 0; y < Y_SIZE; ++y) {

....

prints each row. The loop:

for (x = 0; x < X_SIZE / 8; ++x) {

...

goes through every byte in the row. There are eight bits in each byte handled by the loop:

for (bit = 0x80; bit > 0; bit = (bit >> 1))

which uses an unusual loop counter. This loop causes the variable bit to start with bit 7 (the left-most bit). For each iteration of the loop, the bit is moved to the right one bit by bit = (bit >> 1). When we run out of bits, the loop exits.

The loop counter cycles through:

Binary

Hex

0000 0000 1000 0000

0x80

0000 0000 0100 0000

0x40

0000 0000 0010 0000

0x20

0000 0000 0001 0000

0x10

0000 0000 0000 1000

0x08

0000 0000 0000 0100

0x04

0000 0000 0000 0010

0x02

0000 0000 0000 0001

0x01

Finally, at the heart of the loops is the code:

if ((graphics[x][y] & bit) != 0)

printf("X");

else

printf(".");

This code tests an individual bit and writes an “X” if it is set or a “.” if it is not.

Question 11-2: In Example 11-4 the first loop works, but the second one fails. Why? (Click here for the answer Section 11.9)

Example 11-4. loop/loop.c

#include <stdio.h>

int main()

{

short int i; /* Loop counter */

signed char ch; /* Loop counter of another kind */

/* Works */

for (i = 0x80; i != 0; i = (i >> 1)) {

printf("i is %x (%d)\n", i, i);

}

/* Fails */

for (ch = 0x80; ch != 0; ch = (ch >> 1)) {

printf("ch is %x (%d)\n", ch, ch);

}

return (0);

}

Answers

Answer 11-1: DIRECT_CONNECT is defined to be bit number 8 by the expression (1<<8); however, the eight bits in a character variable are numbered 76543210. Bit number 8 does not exist. A solution to this problem is to make flags a short integer with 16 bits.

Answer 11-2: The problem is that ch is a character (8 bits). The value 0x80 represented in 8 bits is 1000 00002. The first bit, the sign bit, is set. When a right shift is done on this variable, the sign bit is used for fill. So, 1000 00002>>1 is 1100 00002.

The variable i works even though it is signed because it is 16 bits long. So 0x80 in 16 bits is 0000 0000 1000 00002. Notice that the set bit is not near the sign bit.

The solution to our problem is to declare ch as an unsigned variable.

Programming Exercises

Exercise 11-1: Write a set of parameterized macros, clear_bit and test_bit, to go with the set_bit operation defined in Example 11-3. Write a main program to test these macros.

Exercise 11-2: Write a program to draw a 10-by-10 bitmapped square. You can borrow the code from Example 11-3 to print the results.

Exercise 11-3: Change Example 11-3 so that it draws a white line across a black background.

Exercise 11-4: Write a program that counts the number of bits set in an integer. For example, the number 5 (decimal), which is 0000000000000101 (binary), has two bits set.

Exercise 11-5: Write a program that takes a 32-bit integer (long int) and splits it into eight 4-bit values. (Be careful of the sign bit.)

Exercise 11-6: Write a program that will take the bits in a number and shift them to the left end. For example, 01010110 (binary) would become 11110000 (binary).