C++ Recipes: A Problem-Solution Approach (2015)
CHAPTER 4
Working with Numbers
Computers are designed and built to crunch numbers. The programs you write take advantage of that computational power, and the experiences they provide to users depend on your ability to understand and use the tools C++ provides for manipulating numbers. C++ supports several kinds of numbers, including whole numbers and real numbers, along with multiple ways of storing and representing them.
The C++ integer types will be used to store whole numbers and the floating point types will be used to store real numbers with decimal points. There are different tradeoffs and considerations to be taken into account when using each type of number in C++ and this chapter will introduce you to different challenges and scenarios where each type is appropriate. You’ll also see an older technique named fixed point arithmetic that can use integer types to approximate floating point types.
Recipe 4-1. Using the Integer Types in C++
Problem
You need to represent whole numbers in your program but are unsure of the limitations and capabilities of the different integer types.
Solution
Learning about the different integer types supported by C++ will allow you to use the correct type for the task at hand.
How It Works
Working with the int Type
C++ provides integer types that map directly onto the integer representations supported by modern processors. All of the integer types behave in the same way; they differ only in how much data they can hold. Listing 4-1 shows how to define an integer variable in C++.
Listing 4-1. Defining an integer
int main(int argc, char* argv[])
{
int wholeNumber{ 64 };
return 0;
}
As you can see, an integer is defined using the int type in C++. The int type can be used in conjunction with the standard arithmetic operators that allow you to add, subtract, multiply, divide and take the modulus. Listing 4-2 uses these operators to initialize additional integer variables.
Listing 4-2. Initializing integers using operators
#include <iostream>
using namespace std;
int main(int argc, char* argv[])
{
int wholeNumber1{ 64 };
cout << "wholeNumber1 equals " << wholeNumber1 << endl;
int wholeNumber2{ wholeNumber1 + 32 };
cout << "wholeNumber2 equals " << wholeNumber2 << endl;
int wholeNumber3{ wholeNumber2 - wholeNumber1 };
cout << "wholeNumber3 equals " << wholeNumber3 << endl;
int wholeNumber4{ wholeNumber2 * wholeNumber1 };
cout << "wholeNumber4 equals " << wholeNumber4 << endl;
int wholeNumber5{ wholeNumber4 / wholeNumber1 };
cout << "wholeNumber5 equals " << wholeNumber5 << endl;
int wholeNumber6{ wholeNumber2 % wholeNumber1 };
cout << "wholeNumber6 equals " << wholeNumber6 << endl;
return 0;
}
The code in Listing 4-2 contains lines that use operators to initialize additional integers. The operators can be used in a number of ways. You can see that the operators can have either literal values such as 32 or other variables on either side. Figure 4-1 shows the output from this program.
Figure 4-1. The output from running the code in Listing 4-2
The output from Listing 4-2 is shown in Figure 4-1. The following list explains how the values shown in the output end up in each variable.
· The variable wholeNumber1 was initialized with the value of 64 and therefore the output is 64.
· The literal 32 is added to the value of wholeNumber1 and stored in wholeNumber2, therefore the output is 96.
· The next line outputs 32 as the code has subtracted wholeNumber1 from wholeNumber2. The effect of this is that we have recovered the literal value used in the initialization of wholeNumber2 and stored it in the variable wholeNumber3.
· The value of wholeNumber4 is output as 6144, which is the result of 64*96.
· The program prints the value of 96 for wholeNumber5 as it is the result of dividing 6144 by 64, or the value of wholeNumber4 divided by the value of wholeNumber1.
· The value of wholeNumber6 is output as 32. The modulo operator returns the remainder from a division. In this case the remainder of 96/64 is 32, so the modulo operator has returned 32.
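The relationship between the division and modulus operators can be checked directly. The following is a minimal sketch rather than part of the original listings; the DivisionResult struct and the divide function are hypothetical names introduced only for illustration.

```cpp
// Hypothetical helper type: holds both results of an integer division.
struct DivisionResult
{
    int quotient;   // whole-number result of dividend / divisor
    int remainder;  // what is left over, dividend % divisor
};

// Assumes divisor is not zero; dividing by zero is undefined behavior.
DivisionResult divide(int dividend, int divisor)
{
    return DivisionResult{ dividend / divisor, dividend % divisor };
}
```

Calling divide(96, 64) yields a quotient of 1 and a remainder of 32. In every case (quotient * divisor) + remainder gives back the original dividend.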
Working with Different Types of Integers
The C++ programming language provides support for different types of integers. Table 4-1 shows the different types of integers and their properties.
Table 4-1. The C++ integer types
Table 4-1 lists the five main types that C++ supplies to work with whole numbers. The problem C++ presents is that these types are not guaranteed to occupy the number of bytes shown in Table 4-1, because the C++ standard leaves the decision of how many bytes each type occupies up to the platform. The situation isn’t entirely the fault of C++. Processor manufacturers may choose to represent integers using different numbers of bytes, and the standard leaves the compiler writers for those platforms free to size the types to suit their processor. You can, however, write code that guarantees the number of bytes in your integers by using the cinttypes header. Table 4-2 shows the different integers available through cinttypes.
Table 4-2. The cinttypes integers
The names of the types supplied by cinttypes contain the number of bits that they occupy. Given that there are 8 bits in a byte, you can see the relationship between the type and the number of bytes in Table 4-2. Listing 4-3 uses the same operators as Listing 4-2 but is updated to use the int32_t type in place of int.
Listing 4-3. Using the int32_t type with operators
#include <iostream>
#include <cinttypes>
using namespace std;
int main(int argc, char* argv[])
{
int32_t whole32BitNumber1{ 64 };
cout << "whole32BitNumber1 equals " << whole32BitNumber1 << endl;
int32_t whole32BitNumber2{ whole32BitNumber1 + 32 };
cout << "whole32BitNumber2 equals " << whole32BitNumber2 << endl;
int32_t whole32BitNumber3{ whole32BitNumber2 - whole32BitNumber1 };
cout << "whole32BitNumber3 equals " << whole32BitNumber3 << endl;
int32_t whole32BitNumber4{ whole32BitNumber2 * whole32BitNumber1 };
cout << "whole32BitNumber4 equals " << whole32BitNumber4 << endl;
int32_t whole32BitNumber5{ whole32BitNumber4 / whole32BitNumber1 };
cout << "whole32BitNumber5 equals " << whole32BitNumber5 << endl;
int32_t whole32BitNumber6{ whole32BitNumber2 % whole32BitNumber1 };
cout << "whole32BitNumber6 equals " << whole32BitNumber6 << endl;
return 0;
}
The output resulting from this code is similar to that of Figure 4-1 as you can see in Figure 4-2.
Figure 4-2. The output when using the int32_t type and the code from Listing 4-3
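You can ask the compiler to confirm the sizes discussed here. The following sketch uses static_assert so that a mismatch would stop the build; the bitsIn helper is a hypothetical name introduced for illustration, and the checks on int and long long reflect only the minimum sizes the standard guarantees.

```cpp
#include <cinttypes>
#include <climits>
#include <cstddef>

// Hypothetical helper: number of bits occupied by a type on this platform.
template <typename T>
constexpr std::size_t bitsIn()
{
    return sizeof(T) * CHAR_BIT;
}

// The fixed-width types have the same size wherever they are available.
static_assert(bitsIn<std::int8_t>() == 8, "int8_t is 8 bits");
static_assert(bitsIn<std::int16_t>() == 16, "int16_t is 16 bits");
static_assert(bitsIn<std::int32_t>() == 32, "int32_t is 32 bits");
static_assert(bitsIn<std::int64_t>() == 64, "int64_t is 64 bits");

// The built-in types only have minimum sizes guaranteed by the standard.
static_assert(bitsIn<int>() >= 16, "int is at least 16 bits");
static_assert(bitsIn<long long>() >= 64, "long long is at least 64 bits");
```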
Working with Unsigned Integers
Each of the types shown in Table 4-1 and Table 4-2 has an unsigned counterpart. Using an unsigned version of a type means that you no longer have access to negative numbers; however, you gain a larger range of positive numbers represented by the same number of bytes. You can see the C++ standard unsigned types in Table 4-3.
Table 4-3. C++’s built-in unsigned types
The unsigned types store the same number of unique values as their signed counterparts, but over a different range. Both a signed char and an unsigned char can store 256 unique values. The signed char stores values from -128 to 127 while the unsigned version stores the 256 values from 0 to 255. The built-in unsigned types suffer from the same problem as the signed types: they may not occupy the same number of bytes on different platforms. C++’s cinttypes header file provides unsigned types that guarantee their size. Table 4-4 documents these types.
Table 4-4. The cinttypes header file’s unsigned integer types
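The claim that signed and unsigned 8-bit types hold the same number of values over different ranges can be verified with std::numeric_limits. This is a sketch under the assumption that the types involved are no wider than 32 bits; distinctValues is a hypothetical helper name.

```cpp
#include <cinttypes>
#include <limits>

// Hypothetical helper: counts how many distinct values a type can hold.
// Intended for types of up to 32 bits so the arithmetic fits in a long long.
template <typename T>
constexpr long long distinctValues()
{
    return static_cast<long long>(std::numeric_limits<T>::max())
         - static_cast<long long>(std::numeric_limits<T>::min()) + 1;
}

// The ranges differ between the signed and unsigned 8-bit types...
static_assert(std::numeric_limits<std::int8_t>::min() == -128, "signed minimum");
static_assert(std::numeric_limits<std::int8_t>::max() == 127, "signed maximum");
static_assert(std::numeric_limits<std::uint8_t>::min() == 0, "unsigned minimum");
static_assert(std::numeric_limits<std::uint8_t>::max() == 255, "unsigned maximum");

// ...but the number of distinct values is the same.
static_assert(distinctValues<std::int8_t>() == 256, "256 signed values");
static_assert(distinctValues<std::uint8_t>() == 256, "256 unsigned values");
```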
Recipe 4-2. Making Decisions with Relational Operators
Problem
You are writing a program and must make a decision based on the result of a comparison between two values.
Solution
C++ provides relational operators that return true or false based on the comparison being calculated.
How It Works
C++ provides four major relational operators. These are:
· The equality operator
· The inequality operator
· The greater-than operator
· The less-than operator
These operators allow you to quickly compare two values and determine whether the result is true or false. The result of a true or false comparison can be stored in the bool type provided by C++. A bool can only represent either true or false.
The Equality Operator
Listing 4-4 shows the equality operator in use.
Listing 4-4. The C++ equality operator
#include <iostream>
#include <cinttypes>
using namespace std;
int main(int argc, char* argv[])
{
int32_t equal1{ 10 };
int32_t equal2{ 10 };
bool isEqual = equal1 == equal2;
cout << "Are the numbers equal? " << isEqual << endl;
int32_t notEqual1{ 10 };
int32_t notEqual2{ 100 };
bool isNotEqual = notEqual1 == notEqual2;
cout << "Are the numbers equal? " << isNotEqual << endl;
return 0;
}
The code in Listing 4-4 generates the output shown in Figure 4-3.
Figure 4-3. Output from the relational equality operator
The equality operator produces true (represented by 1 in the output) when the values on both sides of the operator are the same. This is the case where Listing 4-4 compares equal1 to equal2. The result of the operator is false (represented by 0) when the values on both sides are different, as when the code compares notEqual1 to notEqual2.
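If you would rather see the words true and false than 1 and 0, the standard boolalpha stream manipulator changes how bool values are printed. The sketch below wraps it in a hypothetical describe function so the formatted text can be inspected; the same manipulator can be passed to cout directly.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: formats a bool as text rather than as 1 or 0.
std::string describe(bool value)
{
    std::ostringstream stream;
    stream << std::boolalpha << value;  // boolalpha switches output to "true"/"false"
    return stream.str();
}
```

For example, describe(10 == 10) produces the text true and describe(10 == 100) produces false.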
The Inequality Operator
The inequality operator is used to determine when numbers are not equal. Listing 4-5 shows the inequality operator in use.
Listing 4-5. The Inequality Operator
#include <iostream>
#include <cinttypes>
using namespace std;
int main(int argc, char* argv[])
{
int32_t equal1{ 10 };
int32_t equal2{ 10 };
bool isEqual = equal1 != equal2;
cout << "Are the numbers not equal? " << isEqual << endl;
int32_t notEqual1{ 10 };
int32_t notEqual2{ 100 };
bool isNotEqual = notEqual1 != notEqual2;
cout << "Are the numbers not equal? " << isNotEqual << endl;
return 0;
}
The output generated by Listing 4-5 is shown in Figure 4-4.
Figure 4-4. The output from Listing 4-5 showing the results of the inequality operator
You can see from Listing 4-5 and Figure 4-4 that the inequality operator will return true when the values are not equal and false when the values are equal.
The Greater-than Operator
The greater-than operator can tell you whether the number on the left is greater than the number on the right. Listing 4-6 shows this in action.
Listing 4-6. The greater-than operator
#include <iostream>
#include <cinttypes>
using namespace std;
int main(int argc, char* argv[])
{
int32_t greaterThan1{ 10 };
int32_t greaterThan2{ 1 };
bool isGreaterThan = greaterThan1 > greaterThan2;
cout << "Is the left greater than the right? " << isGreaterThan << endl;
int32_t notGreaterThan1{ 10 };
int32_t notGreaterThan2{ 100 };
bool isNotGreaterThan = notGreaterThan1 > notGreaterThan2;
cout << "Is the left greater than the right? " << isNotGreaterThan << endl;
return 0;
}
The greater-than operator produces a bool that is either true or false. The result will be true when the number on the left is greater than the number on the right and false otherwise, which includes the case where the two numbers are equal. Figure 4-5 shows the output generated by Listing 4-6.
Figure 4-5. The output generated by Listing 4-6
The Less-than Operator
The less-than operator is the mirror image of the greater-than operator. It returns true when the number on the left is less than that on the right and false otherwise. Note that both operators return false when the two numbers are equal. Listing 4-7 shows the operator in use.
Listing 4-7. The Less-than operator
#include <iostream>
#include <cinttypes>
using namespace std;
int main(int argc, char* argv[])
{
int32_t lessThan1{ 1 };
int32_t lessThan2{ 10 };
bool isLessThan = lessThan1 < lessThan2;
cout << "Is the left less than the right? " << isLessThan << endl;
int32_t notLessThan1{ 100 };
int32_t notLessThan2{ 10 };
bool isNotLessThan = notLessThan1 < notLessThan2;
cout << "Is the left less than the right? " << isNotLessThan << endl;
return 0;
}
Figure 4-6 shows the results when the code in Listing 4-7 is executed.
Figure 4-6. The output generated when the less-than operator is used in Listing 4-7
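The relational operators are most often used to decide which path your code takes. The following is a minimal sketch of that idea; compareValues is a hypothetical function, and the strings it returns are arbitrary labels chosen for illustration.

```cpp
#include <cinttypes>
#include <string>

// Hypothetical helper: chooses a description based on relational comparisons.
std::string compareValues(std::int32_t left, std::int32_t right)
{
    if (left == right)
    {
        return "equal";
    }
    if (left < right)
    {
        return "less";
    }
    return "greater";  // neither equal nor less, so left > right
}
```

Using the values from Listing 4-7, compareValues(1, 10) returns less and compareValues(100, 10) returns greater.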
Recipe 4-3. Chaining Decisions with Logical Operators
Problem
Sometimes your code will require that multiple conditions are satisfied in order to set a Boolean value to true.
Solution
C++ provides logical operators that allow the chaining of relational statements.
How It Works
C++ provides two logical operators that allow the chaining of multiple relational statements. These are:
· The && (and) Operator
· The || (or) Operator
The && Operator
The && operator is used when you would like to determine that two different relational expressions are both true. Listing 4-8 shows the && operator in use.
Listing 4-8. The Logical && Operator
#include <iostream>
using namespace std;
int main(int argc, char* argv[])
{
bool isTrue { (10 == 10) && (12 == 12) };
cout << "True? " << isTrue << endl;
bool isFalse = isTrue && (1 == 2);
cout << "True? " << isFalse << endl;
return 0;
}
The value of isTrue is set to true because both of the relational operations result in a true value. The value of isFalse is set to false because only one of its operands is true: isTrue is true but (1 == 2) is false, and && requires both. The output of these operations can be seen in Figure 4-7.
Figure 4-7. The Logical && Operator output generated by Listing 4-8
The Logical || Operator
The logical || operator is used to determine when either or both of the statements used are true. Listing 4-9 contains code that tests the results of the || operator.
Listing 4-9. The Logical || Operator
#include <iostream>
using namespace std;
int main(int argc, char* argv[])
{
bool isTrue { (1 == 1) || (0 == 1) };
cout << "True? " << isTrue << endl;
isTrue = (0 == 1) || (1 == 1);
cout << "True? " << isTrue << endl;
isTrue = (1 == 1) || (1 == 1);
cout << "True? " << isTrue << endl;
isTrue = (0 == 1) || (1 == 0);
cout << "True? " << isTrue << endl;
return 0;
}
The resulting output generated by this code can be seen in Figure 4-8.
Figure 4-8. The output generated when using logical || operators
Listing 4-9 proves that the logical || operator will return true whenever either or both of the relational operations are also true. When both are false the || operator will also return false.
Note Logical operators use short-circuit evaluation, which the language guarantees for the built-in && and || operators. Evaluation ends as soon as the result is known. This means that the || operator will not evaluate the second term when the first is true and the && operator will not evaluate the second term when the first is false. Be wary of this when the right-hand side calls functions that have side effects beyond their Boolean return value.
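The effect described in the note can be observed by counting how often the right-hand side actually runs. This is a sketch only; countedTrue, rightHandCallsForOr and rightHandCallsForAnd are hypothetical names introduced for the demonstration.

```cpp
// Hypothetical helper with a side effect: it counts how often it is called.
bool countedTrue(int& callCount)
{
    ++callCount;
    return true;
}

// Returns how many times the right-hand side of || ran for a given left operand.
int rightHandCallsForOr(bool left)
{
    int callCount{ 0 };
    bool result{ left || countedTrue(callCount) };
    (void)result;  // the result itself is not needed here
    return callCount;
}

// Returns how many times the right-hand side of && ran for a given left operand.
int rightHandCallsForAnd(bool left)
{
    int callCount{ 0 };
    bool result{ left && countedTrue(callCount) };
    (void)result;  // the result itself is not needed here
    return callCount;
}
```

rightHandCallsForOr(true) and rightHandCallsForAnd(false) both return 0 because the right-hand function is never called, while rightHandCallsForOr(false) and rightHandCallsForAnd(true) both return 1.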
Recipe 4-4. Using Hexadecimal Values
Problem
You are working with code that contains hexadecimal values and you need to understand how they work.
Solution
C++ allows the use of hexadecimal values in code and programmers routinely use hex values when writing out binary representation of numbers.
How It Works
Computer processors use a binary representation to store numbers in memory and use binary instructions to test and modify these values. Due to its low-level nature, C++ provides bitwise operators that can operate on the bits in variables exactly as a processor would. A bit of information can be either a 1 or a 0, and we can construct higher numbers by using chains of bits. A single bit can represent the values 0 or 1. Two bits, however, can represent 0, 1, 2 or 3, because two bits can form four unique patterns: 00, 01, 10 and 11. The C++ uint8_t data type is made up of 8 bits. The data in Table 4-5 shows how these different bits are represented numerically.
Table 4-5. The numerical values of bits in an 8bit variable
A uint8_t variable that stored the value represented by Table 4-5 would contain the number 137. In fact, an 8bit variable can store 256 individual values. You can work out the number of values a variable can store by raising the number 2 to the power of the number of bits i.e. 2^8 is 256.
Note Negative numbers are represented in signed types using the same number of bits as unsigned types. In Table 4-5, a signed value would lose the position at 128 to become a sign bit. You can convert a positive number to a negative using the Two’s Complement of the number. To do this you flip all of the bits and add 1. For the two-bit number 1 you would have the binary representation 01. To get the Two’s Complement, and therefore the negative, first flip the bits to 10 and then add 1, ending with 11. In an 8-bit value you’d follow the same process: 00000001 becomes 11111110 and adding 1 results in 11111111. No matter the number of bits in a variable, -1 is always represented in Two’s Complement by all bits being turned on. This is a useful fact to remember.
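The flip-and-add-one rule from the note can be written as a single expression. The sketch below works on the raw 8-bit pattern using an unsigned type; twosComplement is a hypothetical helper name, and the ~ operator it uses flips every bit of its operand.

```cpp
#include <cinttypes>

// Hypothetical helper: negates an 8-bit pattern by flipping the bits and adding 1.
std::uint8_t twosComplement(std::uint8_t value)
{
    // ~value is computed as an int, so cast back down to keep only the low 8 bits.
    return static_cast<std::uint8_t>(~value + 1);
}
```

Passing 0x01 (binary 00000001) returns 0xFF (binary 11111111), the pattern that represents -1, and passing 0xFF returns 0x01 again.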
Writing bits out in their entirety quickly gets out of hand when dealing with 16, 32 and 64 bit numbers. Programmers tend to write binary representations in a hexadecimal format instead. Hex numbers are represented by the digits 0-9 and the letters A, B, C, D, E and F. The letters A-F represent the numbers 10 through 15. It takes 4 bits to represent the 16 hexadecimal values, so we can represent the bit pattern in Table 4-5 using the hexadecimal 0x89, where the 9 represents the lower 4 bits (8+1 is 9) and the 8 represents the higher 4 bits.
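Because each hexadecimal digit stands for a group of 4 bits, the two digits of an 8-bit value can be recovered with ordinary division and modulus by 16. The following sketch checks the 0x89 example at compile time; highDigit and lowDigit are hypothetical helper names.

```cpp
#include <cinttypes>

// Hypothetical helpers: split an 8-bit value into its two hexadecimal digits.
constexpr std::uint8_t highDigit(std::uint8_t value)
{
    return static_cast<std::uint8_t>(value / 16);  // upper 4 bits
}

constexpr std::uint8_t lowDigit(std::uint8_t value)
{
    return static_cast<std::uint8_t>(value % 16);  // lower 4 bits
}

static_assert(0x89 == 137, "0x89 is 8 * 16 + 9");
static_assert(highDigit(0x89) == 8, "upper digit of 0x89");
static_assert(lowDigit(0x89) == 9, "lower digit of 0x89");
```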
Listing 4-10 shows how you can use hexadecimal literals in your code and use cout to print them to the console.
Listing 4-10. Using Hexadecimal Literal Values
#include <iostream>
#include <cinttypes>
using namespace std;
int main(int argc, char* argv[])
{
uint32_t hexValue{ 0x89 };
cout << "Decimal: " << hexValue << endl;
cout << hex << "Hexadecimal: " << hexValue << endl;
cout << showbase << hex << "Hexadecimal (with base): " << hexValue << endl;
return 0;
}
Hexadecimal literals in C++ are preceded by 0x. This lets the compiler know that you intend for it to interpret the number in hex and not decimal. Figure 4-9 shows the effect of the different output flags used with cout in Listing 4-10.
Figure 4-9. Printing out hexadecimal values
The cout stream by default prints the decimal representation of integer variables. You must pass flags to cout to alter this behavior. The hex flag informs cout that it should print the number in hexadecimal however this does not automatically prepend the 0x base. If you wish your output to have the base on your hexadecimal numbers (and you usually will so that other users don’t read the value as decimal 89 instead of 137) you can use the showbase flag which will make cout add the 0x to your hex values.
Listing 4-10 stores the value of 0x89 in a 32-bit integer type, but the literal only spells out 8 bits. The other 24 bits (6 hex digits) are implicitly 0. The proper 32-bit representation of 137 would actually be 0x00000089.
Note While it’s acceptable to drop the 0s when they are implied, it is also common practice to print all 8 hex digits when a 32-bit number is intended. This is more important when representing negative numbers such as -1. When using an int32_t, 0xF would represent 15, or 0x0000000F, whereas -1 would be 0xFFFFFFFF. Be sure you’re setting the value you really wanted when using hexadecimal values.
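Printing all 8 hex digits can be done with the standard setw and setfill manipulators from the iomanip header. This is one possible approach rather than the only one; toPaddedHex is a hypothetical helper that builds the text in a string stream so that cout's own flags are left untouched.

```cpp
#include <cinttypes>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats a 32-bit value as 0x followed by all 8 hex digits.
std::string toPaddedHex(std::uint32_t value)
{
    std::ostringstream stream;
    // The prefix is written by hand because showbase would count it toward the width.
    stream << "0x" << std::hex << std::uppercase
           << std::setfill('0') << std::setw(8) << value;
    return stream.str();
}
```

With this helper the value 0x89 is formatted as 0x00000089 and 0xF as 0x0000000F, which makes the difference from 0xFFFFFFFF obvious at a glance.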
Recipe 4-5. Bit Twiddling with Binary Operators
Problem
You are developing an application where you would like to pack data into as small a format as possible.
Solution
You can use bitwise operators to set and test individual bits on a variable.
How It Works
C++ provides the following bitwise operators:
· The & (bitwise and) operator
· The | (bitwise or) operator
· The ^ (exclusive or) operator
· The << (left shift) operator
· The >> (right shift) operator
· The ~ (One’s Complement) operator
The & (Bitwise And) Operator
The bitwise & operator returns a value that has all of the bits that were set in both the left and right sides of the operator. Listing 4-11 shows an example of this in action.
Listing 4-11. The & operator
#include <iostream>
#include <cinttypes>
using namespace std;
int main(int argc, char* argv[])
{
uint32_t bits{ 0x00011000 };
cout << showbase << hex;
cout << "Result of 0x00011000 & 0x00011000: " << (bits & bits) << endl;
cout << "Result of 0x00011000 & ~0x00011000: " << (bits & ~bits) << endl;
return 0;
}
Listing 4-11 makes use of both the & and ~ operators. The first use of & will result in the value 0x00011000 being output to the console. The second use of & is used in conjunction with ~. The ~ operator flips all of the bits, so the two operands have no set bits in common and the output from this use of & will be 0. You can see this in Figure 4-10.
Figure 4-10. The output resulting from Listing 4-11
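A common use of the & operator is testing whether one particular bit is set. The sketch below builds a single-bit mask with the left shift operator, which is covered later in this recipe; isBitSet is a hypothetical helper name and it assumes the bit index is less than 32.

```cpp
#include <cinttypes>

// Hypothetical helper: tests whether the bit at bitIndex (0 is the lowest bit) is set.
// Assumes bitIndex is less than 32.
constexpr bool isBitSet(std::uint32_t value, std::uint32_t bitIndex)
{
    // The mask has only the requested bit set, so & keeps that bit and clears the rest.
    return (value & (std::uint32_t{ 1 } << bitIndex)) != 0;
}
```

For the value 0x00011000 used in Listing 4-11, bits 12 and 16 are set, so isBitSet returns true for those positions and false for all others.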
The | (Bitwise Or) Operator
The bitwise or operator returns a value that contains all of the set bits from the left and right sides of the operator. A bit is set in the result whether it was set on one side or on both. The only time a 0 will be placed into a bit is when neither the left nor the right side of the operator has that position set. Listing 4-12 shows the | operator in use.
Listing 4-12. The | Operator
#include <iostream>
#include <cinttypes>
using namespace std;
int main(int argc, char* argv[])
{
uint32_t leftBits{ 0x00011000 };
uint32_t rightBits{ 0x00010100 };
cout << showbase << hex;
cout << "Result of 0x00011000 | 0x00010100: " << (leftBits | rightBits) << endl;
cout << "Result of 0x00011000 | ~0x00011000: " << (leftBits | ~leftBits) << endl;
return 0;
}
The first use of | will result in the value 0x00011100 and the second will result in 0xFFFFFFFF. You can see that this is true in Figure 4-11.
Figure 4-11. The output generated by Listing 4-12
The values stored in leftBits and rightBits share a single bit position that is set to 1. There are two positions where one has a bit set and the other doesn’t. All three of these bits are set in the resulting value. The second use demonstrates that all bits are set so long as the bit position is set in one of the two places. The distinction between the two is important when you look at the results of the next operator.
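The same behavior makes | the natural operator for turning on a single bit without disturbing the others. This is a minimal sketch; setBit is a hypothetical helper name and it assumes the bit index is less than 32.

```cpp
#include <cinttypes>

// Hypothetical helper: returns value with the bit at bitIndex turned on.
// Assumes bitIndex is less than 32. Bits that were already set stay set.
constexpr std::uint32_t setBit(std::uint32_t value, std::uint32_t bitIndex)
{
    return value | (std::uint32_t{ 1 } << bitIndex);
}
```

Setting bit 8 of 0x00011000 gives 0x00011100, the same value produced by the first | in Listing 4-12. Setting a bit that is already on leaves the value unchanged.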
The ^ (Exclusive Or) Operator
This operator will produce a single bit of difference between its output and the output of the | operator shown in Figure 4-11. This is because the exclusive or operator only sets the resulting bit to true when either the left or the right bit is set, not when both are set and not when neither are set. The first | operator in Listing 4-12 resulted in the value 0x00011100 being stored as the result. The ^ operator will result in 0x00001100 being stored when using the same values. Listing 4-13 shows the code for this scenario.
Listing 4-13. The ^ operator
#include <iostream>
#include <cinttypes>
using namespace std;
int main(int argc, char* argv[])
{
uint32_t leftBits{ 0x00011000 };
uint32_t rightBits{ 0x00010100 };
cout << showbase << hex;
cout << "Result of 0x00011000 ^ 0x00010100: " << (leftBits ^ rightBits) << endl;
cout << "Result of 0x00011000 ^ ~0x00011000: " << (leftBits ^ ~leftBits) << endl;
return 0;
}
The evidence of the different output produced can be seen in Figure 4-12.
Figure 4-12. The output generated by the ^ operator in Listing 4-13
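Because ^ clears a bit that is set on both sides and sets a bit that is set on only one, it can be used to toggle a bit. The sketch below shows this; toggleBit is a hypothetical helper name and it assumes the bit index is less than 32.

```cpp
#include <cinttypes>

// Hypothetical helper: returns value with the bit at bitIndex flipped.
// Assumes bitIndex is less than 32. Toggling twice restores the original value.
constexpr std::uint32_t toggleBit(std::uint32_t value, std::uint32_t bitIndex)
{
    return value ^ (std::uint32_t{ 1 } << bitIndex);
}
```

Toggling bit 16 of 0x00011000 clears it and gives 0x00001000; toggling bit 16 again returns 0x00011000.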
The << and >> Operators
The left shift and right shift operators are handy tools that allow you to pack smaller sets of data into larger variables. Listing 4-14 shows code that shifts a value from the lower 16 bits of a uint32_t into the upper 16 bits.
Listing 4-14. Using the << operator
#include <iostream>
#include <cinttypes>
using namespace std;
int main(int argc, char* argv[])
{
const uint32_t maskBits{ 16 };
uint32_t leftShifted{ 0x00001010 << maskBits };
cout << showbase << hex;
cout << "Left shifted: " << leftShifted << endl;
return 0;
}
This code results in the value 0x10100000 being stored in the variable leftShifted. This has freed up the lower 16 bits which you can now use to store another 16 bit value. Listing 4-15 uses the |= and & operators to do just that.
Note Each of the binary bitwise operators (&, |, ^, << and >>) has an assignment variant, such as |=, for use in statements like the one in Listing 4-15.
Listing 4-15. Using a mask to pack values into a variable
#include <iostream>
#include <cinttypes>
using namespace std;
int main(int argc, char* argv[])
{
const uint32_t maskBits{ 16 };
uint32_t leftShifted{ 0x00001010 << maskBits };
cout << showbase << hex;
cout << "Left shifted: " << leftShifted << endl;
uint32_t lowerMask{ 0x0000FFFF };
leftShifted |= (0x11110110 & lowerMask);
cout << "Packed left shifted: " << leftShifted << endl;
return 0;
}
This code now sees two separate 16 bit values being packed into a single 32 bit variable. The value packed into the lower 16 bits has all of its upper 16 bits masked out using the & operator in conjunction with a mask value, in this case 0x0000FFFF. This ensures that the |= operator leaves the values in the upper 16 bits unchanged by virtue of the fact that the value being or’d in won’t have any of those upper bits set. You can see this is true in Figure 4-13.
Figure 4-13. The results of masking values into integers using bitwise operators
The final two lines of output in Figure 4-13 are the result of operations to unmask the values from the lower and upper sections of the variable. You can see how this was achieved in Listing 4-16.
Listing 4-16. Unmasking packed data
#include <iostream>
#include <cinttypes>
using namespace std;
int main(int argc, char* argv[])
{
const uint32_t maskBits{ 16 };
uint32_t leftShifted{ 0x00001010 << maskBits };
cout << showbase << hex;
cout << "Left shifted: " << leftShifted << endl;
uint32_t lowerMask{ 0x0000FFFF };
leftShifted |= (0x11110110 & lowerMask);
cout << "Packed left shifted: " << leftShifted << endl;
uint32_t lowerValue{ (leftShifted & lowerMask) };
cout << "Lower value unmasked: " << lowerValue << endl;
uint32_t upperValue{ (leftShifted >> maskBits) };
cout << "Upper value unmasked: " << upperValue << endl;
return 0;
}
The & operator and the >> operator are used in Listing 4-16 to retrieve the two distinct values from our packed variable. Unfortunately this code has an issue that has yet to be uncovered. Listing 4-17 provides an example of the issue.
Listing 4-17. Shifting and narrowing conversions
#include <iostream>
#include <cinttypes>
using namespace std;
int main(int argc, char* argv[])
{
const uint32_t maskBits{ 16 };
uint32_t narrowingBits{ 0x00008000 << maskBits };
return 0;
}
The code in Listing 4-17 fails to compile. You will receive an error that a narrowing conversion would take place, and your compiler will prevent you from building your executable until the problem code is fixed. The problem here is that the value 0x00008000 has the 16th bit set and once it is shifted 16 bits to the left the 32nd bit would be set. Because the literal is a signed int, this would cause the value to become a negative number under normal circumstances, and a negative number cannot be stored in a uint32_t. At this stage you have two different options in your arsenal to deal with the situation.
Note Those of you who have used C++ before may have noticed that the samples are not using the = operator to initialize variables, as in uint32_t maskBits = 16; Instead I’m using uniform initialization, which was introduced in C++11. Uniform initialization is the form of initialization that uses braces ({}), as seen in these examples. The major benefit of uniform initialization is the protection from narrowing conversions that I’ve just described.
Listing 4-18 shows how you can use an unsigned literal to tell the compiler the value should be unsigned.
Listing 4-18. Using unsigned literals
#include <iostream>
#include <cinttypes>
using namespace std;
int main(int argc, char* argv[])
{
const uint32_t maskBits{ 16 };
uint32_t leftShifted{ 0x00008080u << maskBits };
cout << showbase << hex;
cout << "Left shifted: " << leftShifted << endl;
uint32_t lowerMask{ 0x0000FFFF };
leftShifted |= (0x11110110 & lowerMask);
cout << "Packed left shifted: " << leftShifted << endl;
uint32_t lowerValue{ (leftShifted & lowerMask) };
cout << "Lower value unmasked: " << lowerValue << endl;
uint32_t upperValue{ (leftShifted >> maskBits) };
cout << "Upper value unmasked: " << upperValue << endl;
return 0;
}
Adding a u to the end of a numeric literal causes the compiler to evaluate that literal as an unsigned value. Another option would have been to use signed values instead. However, this introduces a new consideration. When right shifting signed values, compilers typically copy the sign bit into the new bits coming in from the left. The following things can occur:
· 0x10100000 >> 16 becomes 0x00001010
· 0x80800000 >> 16 becomes 0xFFFF8080
Listing 4-19 and Figure 4-14 show code and output that proves the negative sign bit propagation.
Listing 4-19. Right shifting negative values
#include <iostream>
#include <cinttypes>
using namespace std;
int main(int argc, char* argv[])
{
const uint32_t maskBits{ 16 };
int32_t leftShifted{ 0x00008080 << maskBits };
cout << showbase << hex;
cout << "Left shifted: " << leftShifted << endl;
int32_t lowerMask{ 0x0000FFFF };
leftShifted |= (0x11110110 & lowerMask);
cout << "Packed left shifted: " << leftShifted << endl;
int32_t rightShifted{ (leftShifted >> maskBits) };
cout << "Right shifted: " << rightShifted << endl;
cout << "Unmasked right shifted: " << (rightShifted & lowerMask) << endl;
return 0;
}
You can see what the new code needs to do to extract the upper masked value in the lines of Listing 4-19 that compute and print rightShifted. A shift alone is no longer suitable when using signed integers. Figure 4-14 shows the output proving this point.
Figure 4-14. Output showing the sign bit propagation after a right shift
As you can see, I’ve had to shift the variable to the right and mask out the upper bits in order to retrieve the original value from the upper part of the variable. After our shift the value contained the decimal value -32,640 (0xFFFF8080) but the value we expected was actually 32,896 (0x00008080). 0x00008080 was retrieved by using the & operator (0xFFFF8080 & 0x0000FFFF = 0x00008080).
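The packing and unpacking steps from this recipe can be gathered into a few small functions. The following is a sketch that assumes unsigned types throughout, which avoids the sign bit propagation described above; packPair, upperHalf and lowerHalf are hypothetical names that do not appear in the listings.

```cpp
#include <cinttypes>

// Hypothetical helper: packs two 16-bit values into one 32-bit value.
constexpr std::uint32_t packPair(std::uint16_t upper, std::uint16_t lower)
{
    // Widen before shifting so the shift is performed on an unsigned 32-bit value.
    return (static_cast<std::uint32_t>(upper) << 16) | lower;
}

// Hypothetical helper: retrieves the value stored in the upper 16 bits.
constexpr std::uint16_t upperHalf(std::uint32_t packed)
{
    // Right shifting an unsigned value always brings in zeros from the left.
    return static_cast<std::uint16_t>(packed >> 16);
}

// Hypothetical helper: retrieves the value stored in the lower 16 bits.
constexpr std::uint16_t lowerHalf(std::uint32_t packed)
{
    return static_cast<std::uint16_t>(packed & 0x0000FFFF);
}
```

Packing 0x8080 and 0x0110 produces 0x80800110, and upperHalf returns 0x8080 from that value without any additional masking.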