How Does the Computer Really Work? - Computer Science Programming Basics in Ruby (2013)


Chapter 2. How Does the Computer Really Work?


§ Basic nomenclature and components of a computer system

§ Bit strings and their meaning

2.1 Introduction

This book is about the basics of computer science in which algorithms are expressed by software; it is not about computer hardware. However, as software is the tool by which we use the computer, a rudimentary knowledge of the properties of computers is desirable to understand the attributes, properties, limitations, and sometimes idiosyncrasies of software.

2.2 Basic Nomenclature and Components of a Computer System

It may be argued that this brief introduction to hardware is unnecessary. The computer has become a utilitarian device, to be used by people who are nontechnical—the same way that a car can be used by all people, without any need to understand the workings of the engine, the various support systems, and the energy management of the car. This is true, but only partially.

Consider a hybrid car, such as the Toyota Prius. It is designed to be just like any other car: drivable without the intricate understanding needed to grasp the concept of the synergy drive of a car where multiple modes of propulsion cooperate to optimize the energy usage of this essentially electric car. However, the actual energy consumption differs between drivers. Those who understand the working of this car will get better energy efficiency than the casual driver—in our experience sometimes as high as a 15% difference.

We argue that the same concept is true for software. Understanding the underlying machinery (the computer system) enables more efficient software development. This may not be important for small tasks, but it may be crucial for very large ones.

A digital computer—and we limit ourselves to these only—is a device that has three main parts: at least one processing unit, called the central processing unit or CPU, at least one memory unit, and a control unit. A computer system has, in addition to a computer, a set of peripheral devices that can be roughly divided into three categories: user interface devices, mass storage devices, and communication devices.

Most of the computers today are based on the Von Neumann model of computing, which is as follows: the memory holds computer programs and all the data values that are necessary for the computer program. A computer program is built from instructions that are executed in a logical sequence. The computer operates by causing the control unit to select an instruction from memory. That instruction causes the control unit to fetch data from the memory to the processing unit. There may be one data item, more than one, or none. The processing unit then performs the function implied by the instruction, and the control unit saves the result in memory and selects the next instruction to be fetched from memory.

This is the gist of the Von Neumann model. In reality, there are various types of memory, very complex control units, and optionally multiple processing units that can deal with many instructions in parallel. There are many other optimizations, but no matter how complex, logically, there is a strict serialization imposed by this model, and the instructions seem to be performing serially.

The memory stores all its contents, be it instructions or data, as numbers. The representation of numbers that we use is called the radix or positional representation. To create such a representation, we choose a radix (sometimes called the base) of the representation, say, r. We select r symbols that have the values of 0 through r – 1. Numbers are then represented by a sequence of these symbols. Each position in the sequence has an ordinal (sequence position number), counted from right to left. Thus, the rightmost position has the ordinal 0, the next one has ordinal 1, and so on. The value of the represented number is then computed by multiplying the value of the symbol in position n by the weight or the factor of that position, that is, r^n, and adding all values together.

In our familiar decimal system, the radix is 10. The 10 symbols that we use are 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9. We call these symbols digits, carrying the values from zero to r – 1, which is 9. For example, to see what is represented by a three-digit number, say, 123, we compute the weight of each position. Position 0 will have the factor 10^0, which is 1, the second position has the factor 10^1, which is 10, and the third has the factor 10^2, which is 100. The value of the number is thus 3 × 1 + 2 × 10 + 1 × 100 = 123, as expected.
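The positional computation just described can be sketched in Ruby, the language used throughout this book (`positional_value` is a hypothetical helper name chosen for this sketch, not a built-in):

```ruby
# Evaluate a positional (radix) representation: multiply each digit
# by the weight of its position, radix**position, counting positions
# from the right starting at 0, and sum the products.
def positional_value(digits, radix)
  digits.reverse.each_with_index.sum { |digit, pos| digit * radix**pos }
end

positional_value([1, 2, 3], 10)  # => 123, i.e., 3*1 + 2*10 + 1*100
```

The same method works for any radix, as the quaternary example below shows.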


A computer system has two types of memory: short-term random access memory (RAM) that is fast but limited (usually measured in gigabytes with access times in nanoseconds) and long-term storage that is thousands of times slower but far more vast (often measured in terabytes with access times in milliseconds). Additionally, a computer has a few small brains, or central processing units (CPUs), that execute instructions, input and output media, and a network connection to support the exchange of information between machines.

Assume now radix 4—that is, the base of our positional system is 4, usually called the quaternary system. We need four symbols that we choose to be 0, 1, 2, and 3, with the obvious values. These are our quaternary numerals.

What is the (decimal) value of our three-digit number 123₄, where the subscript denotes that it is in base 4? The positions now have weights (factors) of 4^0 = 1, 4^1 = 4, and 4^2 = 16. The decimal value of our number is now 3 × 1 + 2 × 4 + 1 × 16, which is, in decimal, 27.
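Ruby's built-in conversions can confirm this arithmetic; both `String#to_i` and `Integer#to_s` accept a base argument:

```ruby
"123".to_i(4)  # => 27, quaternary 123 read as a base-4 number
27.to_s(4)     # => "123", converting back to base 4
```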

Another quaternary system, used heavily in genetics, uses the symbols A, C, G, and T, expressing the sequence of DNA nucleotides (A C G T) as a quaternary number, sometimes using its decimal value.
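A small sketch of this idea in Ruby (the digit assignment A=0, C=1, G=2, T=3 is an assumption chosen for this illustration; a given genetics application may map the nucleotides differently):

```ruby
# Treat a DNA sequence as a quaternary number, assuming A=0, C=1, G=2, T=3.
def dna_to_decimal(sequence)
  sequence.tr("ACGT", "0123").to_i(4)
end

dna_to_decimal("GAT")  # => 35, since 2*16 + 0*4 + 3*1 = 35
```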

The prevalent numerical form used in our computers is based on the radix 2, and is called binary. In the binary system, every binary digit, called a bit, has one of two possible values, 0 or 1. The number stored in the memory is thus composed from a string of bits, each having a value of zero or one. The meaning of the string is decided by the way it is used; it may be interpreted in many ways, to be discussed later in this chapter.

Memory is built from cells, and each cell has its own unique address. Most computers use consecutive natural numbers, starting from zero, as addresses, sometimes called locations. In most computers, the addresses refer to cells that can hold eight bits—we refer to these cells as bytes. These bytes can be accessed in an arbitrary order, that is, the computer can select any address to read from or write into. For this reason, these memories are referred to as random access memories or RAM.

Bytes can be grouped into larger strings and accessed as an ordered string of bits, as will be apparent throughout this book. Modern computers have memories that hold billions of bytes (we will discuss sizes in the following section).

The peripheral devices that complement the computer to create a computer system are, as already mentioned, of three different categories. We sometimes also subdivide each category into input (to the computer), output (from the computer), or input/output or I/O devices.

The user interface devices used for input are, for example, keyboards, touch screens, microphones, and various sensors. Examples of output devices in this category are printers, screens, drawing and typing devices, light and audio devices, and various signaling devices.

Mass storage devices are designed to hold information many orders of magnitude larger than memories. They include various types of magnetic devices, such as disks, tapes, and memory cards, optical devices such as CD or DVD drives, and electromagnetic devices such as mass memories. Almost all of these fall in the I/O category, although many may be input only, such as nonwritable CDs and DVDs or read-only memories (referred to as ROM). The properties of all these devices are dramatically different from RAM.

The development of new manufacturing technologies that enable large, low-power-consumption, solid-state memories, and the parallel development of novel, high-capacity batteries, is creating a shift in the structure of computer systems. The new solid-state memories are slowly replacing the traditional, magnetic-memory-based, mechanically powered disks and the optically based CD and DVD memory devices. As of 2012, tablets, mobile devices, and even laptop computers have no mechanical components, and thus no disk, DVD, or CD devices; all such devices are replaced by solid-state large memories. There are, however, external disk, CD, and DVD drives that can be connected to these new computing devices, thus providing both a transition path and backup capabilities for the computing devices. These drives are powered through the computer system itself (via their data connection interface—currently the USB); therefore, they do not require power connections of their own.

Communication devices are typically I/O devices that connect the computer to a network, be it local in a room, or global. These may be without a physical connecting device (wireless, line-of-sight microwave, light of various spectrum, sound wave activator, or sensor) or wired (copper cable, fiber optics, or sound conduit).

The peripheral devices are controlled by the I/O part of the control unit and require quite a sophisticated set of software programs to make them useful. The reader is referred to any basic book about operating systems to complement her or his knowledge of this subject. For a list of suggested reading, see Appendix A.

2.3 Scales of Magnitude

Mass storage devices and memories are very large and thus measured with units that have names different from those used in everyday life. While we use the colloquial word grand to refer to $1,000, for amounts greater than $1,000 we use the names of the decimal system, such as million. These are not universally used—in the United States, one thousand million is called billion; in Europe it is called milliard. There is, however, an agreed upon nomenclature for powers of 10 so that one thousand is called kilo, one million is called Mega, and so on (see Table 2-1). Note the lowercase in kilo, the uppercase in Mega, and all that follow. This comes from the fact that the letter K is reserved, in the decimal nomenclature, for the designation of the absolute temperature measure (degrees in Kelvin).[1]

Table 2-1. Scales of magnitude

Name             Actual size (bytes)   Other names          Real-world quantities
Megabyte (MB)    10^6 = 1,000,000      Million, 10^6        The King James version of the Bible contains approximately 5 million characters.
Mebibyte (MiB)   2^20 = 1,048,576                           The speed of light is 300 million meters/second.
Gigabyte (GB)    10^9                  Billion, 10^9        At 5% interest, $1 billion would return $50,000,000/year.
Gibibyte (GiB)   2^30                                       A billion $1 bills, end to end, would wrap the Earth at the equator 4.5 times.
Terabyte (TB)    10^12                 Trillion, 10^12      The U.S. GDP for 2006 was $13 trillion.
Tebibyte (TiB)   2^40                                       Global GDP in 2006 was estimated by the World Bank to be $46 trillion.
Petabyte (PB)    10^15                 Quadrillion, 10^15   About 40 × 10^15 meters is the distance to the nearest star (excluding the Sun), Alpha Centauri.
Pebibyte (PiB)   2^50                                       Large multinational enterprises and massive scientific databases are in this neighborhood of storage.
Exabyte (EB)     10^18                 Quintillion, 10^18   The oceans on the Earth contain about 326 quintillion gallons of water.
Exbibyte (EiB)   2^60
Zettabyte (ZB)   10^21                 Sextillion, 10^21
Zebibyte (ZiB)   2^70

The computer is not based on the radix 10; it is based on the radix 2. Inasmuch as 2^10 equals 1,024, which is close to 10^3, it became customary in the past to refer to 2^10 as kilo. Thus, one kilobyte was approximately one thousand bytes, and the discrepancy was small. When we move from a kilobyte to a megabyte, which now stands for 2^20 bytes, the discrepancy between 10^6 and 2^20 is significant, as 10^6 = 1,000,000 and 2^20 = 1,048,576. This is not a small difference and cannot be ignored. Obviously, as we move toward larger scales, the discrepancy between the decimal names and the binary-based quantities they describe grows, causing confusion and inconsistency in reporting sizes.
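The sizes, and the growing discrepancy, are easy to check in Ruby:

```ruby
10**3  # => 1000      one kilobyte in the decimal sense
2**10  # => 1024      one kibibyte; about 2.4% larger
10**6  # => 1000000   one megabyte in the decimal sense
2**20  # => 1048576   one mebibyte; about 4.9% larger

# The megabyte/mebibyte discrepancy, in percent:
(2**20 - 10**6) * 100.0 / 10**6  # => 4.8576
```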

For that reason, as of 2005, there is a standard that introduces new names for quantities expressed as powers of 2 and retains the familiar names for quantities expressed as powers of 10. Table 2-1 has names, sizes, and observations about the real meaning of the sizes, starting with megabyte for the decimal meaning of the size in bytes and mebibyte for the binary meaning. As of the time of this writing (2013), sizes of mass storage devices are usually quoted in the decimal meaning, and sizes of RAM are quoted in the binary meaning, both using the decimal nomenclature. This confusion, well exploited in advertising, will hopefully disappear as the binary nomenclature becomes more widely used, or if the community decides to report sizes correctly when the decimal nomenclature is used.

Please refer to Table 2-1 to make sense of what you just read. The binary prefixes were first proposed by the IEC (International Electrotechnical Commission) in January 1999 and expanded in 2005 to include all binary equivalents to the accepted decimal prefixes. All binary prefixes and names were codified by the IEEE (Institute of Electrical and Electronics Engineers) as a standard in 2005 (IEEE 1541-2002).

2.4 Instruction Execution—Speed and Timing Scales

As explained earlier, programs operate by the control unit causing the central processing unit to execute instructions in a logically sequential manner. It is immaterial how many instructions are in a different execution phase at any point in time; their effect is transmitted in a serial fashion, one instruction at a time.

Instructions are executed in phases that take time, each controlled by a timing mechanism called a clock. In reality, there may be several clocks, but suffice it to say that clocks operate in various frequencies that define the basic time step of the instruction execution phases. Clock speeds are measured in Hertz (Hz), where 1 Hz is one clock cycle per second.

The scales of time and frequency are summarized in Table 2-2. It is important to realize the meaning of the scales represented there.

Modern computers (in 2013) operate on a clock that is typically somewhere between 2 GHz and 4 GHz. The way that clock speed translates into instructions executed per second is not trivial and depends heavily on the design and cost of the computer. Again, that is not the topic of this book. Here we just state that while there is a link between the clock speed and the instruction execution rate, it should not be inferred that computer A with a clock rate double that of computer B will perform at twice the speed of B. The complication arises partially from overlap between phases of instruction execution and from the fact that different instructions typically take a different number of clock steps to execute.

To get a better handle on computer speeds, we measure them by the instruction rate rather than the time each instruction takes. These ratings are sometimes expressed in MIPS (Millions of Instructions Per Second) or FLOPS (FLoating-point Operations Per Second), and by multiples of these units such as MegaFLOPS or TeraFLOPS. To determine computer speeds, specially crafted programs are run and their execution times are recorded. This measures speed more realistically than simply using the processor’s clock speed.

Table 2-2. Scales of time and frequency

Name              Fraction of a second   Frequency                   Real-world quantities
Second (s)        1                      Hertz (Hz), 1 Hz            The speed of light is 300 million meters/sec.
Millisecond (ms)  10^-3                  Kilohertz (kHz), 10^3 Hz    A high-speed disk rotates once in 10 msec.
Microsecond (µs)  10^-6                  Megahertz (MHz), 10^6 Hz    A typical laptop performs about 8,000 basic instructions in about one microsecond (Intel Core 2 Duo).
Nanosecond (ns)   10^-9                  Gigahertz (GHz), 10^9 Hz    Light traverses only 30 cm in one nanosecond. An instruction on a computer is done in several nanoseconds.
Picosecond (ps)   10^-12                 Terahertz (THz), 10^12 Hz

As supercomputers increase in size and speed, the complexity of problems solved by them increases to a point that the access of data in memory dominates the speed of execution. This necessitates new approaches to speed evaluations. As an example, one of the newer approaches (in 2011) introduced a measure called gigateps, a billion traversed edges per second, based on the speed of solving an analysis of the connections, or edges, between points in a graph.

Timing considerations are important not only for instruction execution, but also for the operation of peripheral devices and communication devices. These considerations, as with the previous ones relating to instruction speed, are beyond the scope of this book. Suffice it to say that the rotation of disks, measured in milliseconds per revolution, is many orders of magnitude slower than the execution rate of instructions. Significant portions of operating systems are devoted to mitigating this difference so that the speed of execution will be minimally impacted by the slowness of the peripheral devices.

2.5 Bit Strings and Their Meaning

As discussed before, the contents of the memory consist of strings of bits. Most computers have these stored in individually addressable units of eight bits, called bytes. The bytes in turn can be strung together to form longer strings. For historical reasons, a group of two consecutive bytes is called a half word, four bytes (32 bits) are called a word, and eight bytes (64 bits) are called a double word or long word.

The meaning of the strings of bits is just that—a string of bits. The interpretation of the meaning, however, is dependent on the usage of the string. One important usage is to code an instruction and its parameters. There are many types of instructions: numerical, like add and multiply, logical, control, program flow, and others. Again, this book is not devoted to hardware details, so we do not elaborate. Simply said, a string of bits can be interpreted as an instruction, and given the address of the proper byte in this string, the control unit will try to decode and execute that instruction. The instruction itself will cause the control unit to find the next instruction, and so on.

Bit strings also can represent data. Here we have a wide variety of possibilities, so we restrict ourselves to the most prevalent data coding.

The simplest one is the integer. In this interpretation, the bit string represents a positive numerical value in radix 2. This means that each string represents a binary number, where each digit is weighed by the proper power of two, starting from zero on the extreme right (end of the string) and proceeding to the left. Thus, the string 01011101 will have the meaning 1 × 2^0 + 0 × 2^1 + 1 × 2^2 + 1 × 2^3 + 1 × 2^4 + 0 × 2^5 + 1 × 2^6 + 0 × 2^7, where × is the multiplication sign. Evaluating this expression, the string 01011101 has the value of 1 + 0 + 4 + 8 + 16 + 0 + 64 + 0 or 93. We do not discuss here how negative values are represented, but they are available.
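Ruby can perform this interpretation directly, which is a handy way to check the arithmetic:

```ruby
"01011101".to_i(2)  # => 93, the bit string read as a binary integer
93.to_s(2)          # => "1011101" (the leading zero is dropped)
```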

Integer values are limited by the length of their string representations. Ruby recognizes two types of integers: regular (Fixnum) and long (Bignum). We discuss those (and other numeric representations) and their properties in a forthcoming chapter.

To alleviate the limitation of size imposed on integers, a very important representation of data is available. It is called floating point or real. The former name is older and used primarily in discussing hardware concepts.

In this interpretation, numbers are represented in scientific form, that is, as x × 2^y. Thus, part of the string is used to hold the value of x and part is used to hold the value of y. Both x and y are expressed by their binary values, derived in the same way that we presented in our discussion of integers, or in a more complex form, as negative values introduce additional difficulties. As you will see, there are different types of real numbers.
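Ruby exposes exactly this decomposition of its floating-point numbers through `Math.frexp` (split a number into a fraction and an exponent) and `Math.ldexp` (reassemble it):

```ruby
# 93.0 == fraction * 2**exponent, with fraction in [0.5, 1.0)
fraction, exponent = Math.frexp(93.0)
fraction                        # => 0.7265625
exponent                        # => 7
Math.ldexp(fraction, exponent)  # => 93.0
```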

The last interpretation that we discuss is that of characters.

Character representation follows an international standard, codified under the name Unicode. The standard provides for a representation of both character and noncharacter-based texts (such as, for example, Far East languages) and for the representation of other items (such as, for example, mathematical symbols, control characters, etc.). The Unicode representation uses one to four bytes per item. The first 128 characters, digits, symbols, and codes are contained in one byte and are identical to the previous standard known as ASCII (American Standard Code for Information Interchange). Almost all English-based text files belong to this category, so it is customary to state that characters are single bytes.
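Ruby lets us inspect these encodings: `String#ord` gives the code point of a character, and `Integer#chr` goes the other way.

```ruby
"A".ord           # => 65, the code point of "A" (the same value as in ASCII)
65.chr            # => "A"
"A".bytes.length  # => 1, a single byte suffices
"é".bytes.length  # => 2, a multibyte character in UTF-8
```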

For in-depth information on this important topic, the voluminous Unicode standard description (currently more than 600 pages) contains tables, descriptions, rules, and explanations for dozens of different scripts, languages, symbols, and so on.

There is a difference between character representations and their meaning. For example, the character “9” is not the number 9. The number 9 is represented by the character “9.” This distinction will be very important in future chapters as we have input programs that read characters, but we wish to use them as numbers. In our former example, we have seen that the number 93 is stored as the string 01011101, but the character string “93” will be stored in a completely different way. To obtain the number 93 from the character string “93,” we need a process of conversion from the character to the numerical representation. Ruby provides such a process, as do all programming languages.
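The distinction, and the conversion Ruby provides, looks like this:

```ruby
"9" == 9       # => false; the character "9" is not the number 9
"93".to_i      # => 93, converting the character string to a number
93.to_s        # => "93", converting the number back to characters
"93".to_i + 7  # => 100; arithmetic works only after conversion
```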

These are the most important but by no means the only types of interpretations of bit strings. Some others may represent different types of data, be they components of colors for the display, controls for various devices, amplitudes for sound presentation, and so on.

2.6 The Interpreter Process and Ruby

We now have covered the general concepts of computer systems embodied in a Von Neumann–style machine. We stated that programs and the data used by them reside in central memory, which can be replenished by various peripheral devices. We also stated that the memory stores its content in bits—binary digits consisting of 1’s and 0’s.


Programs in Ruby or any other programming language are strictly human-readable. However, a computer understands only instructions that are encoded as a sequence of 1s and 0s (binary digits). Thus, we use another program, called an interpreter (which translates one step at a time) or a compiler (which translates all steps in advance), to translate the English-like programming language into binary machine instruction codes.

In the following chapters we will introduce various algorithms or processes designed to solve problems. Among all possible ways to introduce and express the algorithms, we have chosen the Ruby programming language. This language, and other programming languages, express the algorithms via sequences of unambiguous sentences called statements. These are written using the Latin character set, digits in decimal notation, and special symbols, such as =, ,, >, and others. Clearly, these are not binary digits, so these are not programs that can be directly executed by a computer. What procedure is used to accept programs written in a language like Ruby and causes them to be performed or, as we say, executed, by a computer?

There are several methods to accomplish this task. We will dwell on two of these: compilation and interpretation. In the interpretation process, we will use two different approaches, so one can claim that we will introduce three methods.

To begin, we will assume that the program to be executed is contained in a file produced, say, by a word processor such as Microsoft Word or a similar one. As a matter of fact, in this book we will advocate using an editor that is directly geared toward writing Ruby programs, as opposed to a general-purpose word processor.

It is important to bear in mind that this book does not intend to cover the areas of compilation and interpretation. All we do here is introduce the concepts so that the rest of this book will be better understood.

Compilation is a process that analyzes a program statement by statement, and for each statement produces the instructions that execute the algorithm steps implied by that statement. Once all the statements are analyzed and the instructions produced, the compilation process will create a file containing the executable code for that program. That file, in turn, is loaded into the memory to be executed.

The compilation process is performed by a program called a compiler. Simply put, a compiler translates instructions written in a given computer language, call it X, to a format that can execute on a particular computer type, call it Y. Examples of X include C++, Java, and Ruby. Examples of Y include Intel’s Core 2, Motorola’s 68060, and NEC’s V20. So formally, a compiler for language X for a computer Y is typically (but not always) a program that is written in instructions executable on Y and, while executing and residing in the memory of Y, accepts as input a file containing statements in the language X and produces a file containing instructions for execution on computer Y.

A modern computer system will typically have several compilers for several languages.

Interpretation is a horse of a different color. In this process, statements are analyzed one by one and executed as they are encountered. In the pure case of interpretation (there are variants not discussed here) no instructions for the computer are produced—only the effect of the statements is evident. There is, therefore, a program called an interpreter for language X (written for computer Y) that accepts, as input, statements in language X and executes them.

There are essentially two main ways to do interpretation, and both are supported by the Ruby interpreter. The first one is called the interactive mode. In this mode, the programmer is prompted by the interpreter to enter one statement at a time, and the interpreter executes it. It can be viewed as a glorified calculator. It is very useful for such tasks as short programs, concept evaluation, and experimenting with options. It also is a good way to check and see if a statement does what you think it will do. It is often a good idea to try something out in the interactive interpreter before you put it in a program.

The second mode is the so-called batch mode (the name has historical roots; do not worry about what it means). In this mode, the program is prepared the same way it is in compilation; it is prepared in its entirety and stored in a file. The file containing the program is used as the input to the interpreter that analyzes the file statement by statement and executes the statements one by one.

Ruby is an interpreted language. It is beyond the scope of this book to say more on this subject, but as you dive into the language, and in particular as you run programs, how it all works will become increasingly evident.

2.7 Summary

While algorithm design typically abstracts away the underlying target computer architecture, completely ignoring architecture in introductory computer science books unnecessarily limits the understanding of readers. Understanding computer architecture basics and accounting for such basics in the design of algorithms often reduces the running time of the algorithm on the target computer. Thus, in this chapter, the fundamental aspects of computer architecture were introduced. We described the basic components of a computer, the fundamentals of data representation, and various unit determinations.

2.7.1 Key Concepts

§ The Von Neumann model of computing is the prevalent model in the architecture (structure) of computers, the one followed in this book.

§ A computer consists of a single or several central processing unit(s), memory(ies), and a control unit.

§ Both instructions and data reside in the memory.

§ Instructions are followed in a sequential manner, with some instructions capable of causing changes in the sequence.

§ A computer system includes a computer and peripheral devices of various types.

§ Peripheral devices, sometimes called input/output devices, are divided into user/computer interface (including sensors), communication, and mass memory devices.

§ All data are stored in binary form, but the interpretation of those data depends on their usage.

§ Two means to execute instructions are compilation and interpretation.

2.7.2 Key Definitions

§ Central Processing Unit (CPU): The part of a computer that executes instructions.

§ Random Access Memory (RAM): The main memory of the computer (there are also RAM units available as peripheral devices). RAM contents can be modified.

§ Read-Only Memory (ROM): Memory whose contents cannot be modified by computer instructions.

§ Radix (base): The base of a positional system.

§ Integer: Interpretation of memory contents as a whole number of limited range.

§ Real (floating-point) number: Interpretation of memory contents as containing two parts, man (mantissa) and exp (exponent), so that the number is expressed as man × 2^exp.

§ Character: Interpretation of memory contents as a literal (letter, number, symbol, etc.).

§ Compilation: Translation of instructions written in a given language to the language of instruction for the target machine.

§ Interpretation: Step-by-step execution of the specified instructions.

2.8 Exercises

1. Write 208 in binary and in ternary (base 3). Hint: what are the ternary digits?

2. The octal system (base 8) uses the digits 0 through 7. The representation of the letter A in the ASCII encoding scheme is 1000001 in binary. What is it in octal?

3. Color pictures are built of pixels, each represented by three bytes of information. Each byte represents the intensity of the primary colors red, green, and blue (or RGB values). How many gigabytes of storage are required for a 1028 × 1028–pixel color picture?

4. A communication device has the capacity to transfer one megabit of data per second. A 90-minute movie is recorded at 25 frames per second, each frame consisting of 720  × 600 pixels. How long would it take to transfer this movie across the previously described communication device? Would someone be able to stream the video over this communication device without experiencing jittery playback? Explain why or why not.