Number Bases and Measurements of Internet Importance:
A Review of Base 10, Base 2, Base 16, and some
measurement terminology.

Computing & Information Services Department.
author: jim.cerny@unh.edu

http://www.unh.edu/NIS/Courses/Jargon/number-bases.html
updated 10-JUN-1999

Base 10, Base 2, and Base 16.

While most of our discussions use the familiar
base 10, or "decimal" as we will sometimes call it,
there are times when we refer to base 2 ("binary")
and base 16 ("hexadecimal").  You should be able to
recognize the terminology for each base and, if
given a base 2 or base 16 number, to convert it to
base 10.

Let's concentrate on the first eight powers of two, plus
the value one less than the ninth, since these are numbers
that often come up in discussions.
(This document is intended to supplement a discussion 
in a class, to present the highlights, not to be a 
verbatim transcript.)  

Computers do internal arithmetic in base 2 or binary.  
Base 16 is used in various special discussions such as 
in Ethernet addresses or for colors in HTML tags.  In
base 10 a recurring upper limit is 256 because a common
unit of computer storage, the byte, holds 8 bits, and
8 bits allow for specification of 256 different
characters in a byte.  We often talk about bits, as in
"bits per second" (also written as "bits/sec"), when 
talking about network speeds or capacity.  The word 
"bits" is just a shorthand for "binary digits".  A 
computer convention to be aware of is that sometimes 
the lowest number in a sequence is called "0" rather 
than "1"; if you do that with 256 values then they 
range 0-255 rather than 1-256.
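
As a small illustration of the byte range and of hexadecimal
in HTML colors, here is a minimal sketch in Python.  The
function name parse_html_color and the sample color "#FF8000"
are made up for this example and assume the common six-digit
form of an HTML color.

```python
def parse_html_color(color):
    """Split an HTML color like '#FF8000' into red, green, blue values."""
    hex_digits = color.lstrip("#")
    # Each pair of hex digits is one byte, so each value falls in 0-255.
    return tuple(int(hex_digits[i:i + 2], 16) for i in (0, 2, 4))

print(parse_html_color("#FF8000"))          # (255, 128, 0)
print(2 ** 8)                               # 256 distinct values in one byte
print(min(range(256)), max(range(256)))     # 0 255
```

Counting from 0, the 256 possible values of each color
component run from 0 (hex 00) to 255 (hex FF).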

Rather than try to write exponents as small 
superscripts, they are written here as "^n" where
the caret indicates an exponent and the "n" is
the value.

Base 10 uses the digits 0,1,2,3,4,5,6,7,8,9
Base  2 uses the digits 0,1
Base 16 uses the digits 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F

---------------------------------------------------------
    2^n        base 10           base 2         base 16
---------------------------------------------------------
    2^0              1         00000001               1
    2^1              2         00000010               2
    2^2              4         00000100               4
    2^3              8         00001000               8
    2^4             16         00010000              10
    2^5             32         00100000              20
    2^6             64         01000000              40
    2^7            128         10000000              80
   (2^8)-1         255         11111111              FF
---------------------------------------------------------
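
The rows of the table above can be reproduced with a short
Python sketch.  The helper name power_row is an assumption
for this example; format() is Python's built-in formatting
function, with "08b" meaning eight binary digits and "X"
meaning uppercase hexadecimal.

```python
def power_row(n):
    """Return 2^n in base 10, base 2 (8 digits), and base 16."""
    value = 2 ** n
    return value, format(value, "08b"), format(value, "X")

for n in range(8):
    decimal, binary, hexadecimal = power_row(n)
    print(f"2^{n}  {decimal:>4}  {binary}  {hexadecimal:>3}")

# The last table row is one less than 2^8, the largest value in a byte.
largest = 2 ** 8 - 1
print(f"(2^8)-1  {largest}  {largest:08b}  {largest:X}")
```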

Let's work through the calculations for the last row,
expanding each number digit by digit using decimal
numbers.

2^8 = 2 x 2 x 2 x 2 x 2 x 2 x 2 x 2 = 256, so (2^8)-1 = 255

255 = 2x100 + 5x10 + 5x1

11111111 = 1x128 + 1x64 + 1x32 + 1x16 + 1x8 + 1x4 + 1x2 + 1x1 = 255

FF = 15x16 + 15x1 = 255
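
The same digit-by-digit expansion can be sketched in Python.
The function name to_decimal is hypothetical, and the sketch
assumes the digits are valid for the base given; Python's
built-in int() does the same conversion and is shown for
comparison.

```python
def to_decimal(digits, base):
    """Convert a digit string in the given base to a base 10 integer."""
    symbols = "0123456789ABCDEF"
    total = 0
    for digit in digits.upper():
        # Shift the running total one place left, then add the new digit.
        total = total * base + symbols.index(digit)
    return total

print(to_decimal("11111111", 2))   # 255
print(to_decimal("FF", 16))        # 255
print(to_decimal("255", 10))       # 255
print(int("FF", 16))               # 255, using the built-in conversion
```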

Measurement Terminology.

Prefixes for really big numbers and really small numbers:

prefix  exponent  units          number
------  --------  ----------     ---------------------
kilo        3     thousands                   1,000.
mega        6     millions                1,000,000.
giga(1)     9     billions            1,000,000,000.
tera       12     trillions       1,000,000,000,000.
peta       15     quadrillions
exa        18     quintillions
zetta      21     sextillions
yotta      24     septillions

milli      -3     one-thousandth      .001
micro      -6     one-millionth       .000001
nano       -9     one-billionth       .000000001
pico(2)   -12     one-trillionth      .000000000001
femto     -15     one-quadrillionth
atto      -18     one-quintillionth
zepto     -21     one-sextillionth
yocto     -24     one-septillionth

   (1) pronounced both "gig-ga" and "jig-ga"
   (2) pronounced both "pee-ko" and "pie-ko"

In abbreviating measurements, be sure to distinguish between
"MB" for "megabytes" and "Mb" or "Mbits/sec" for "megabits" or
"megabits per second."  By convention a capital "B" means bytes
and a lowercase "b" means bits, although casual usage is not
always consistent.  The simple quantity in megabytes is often
used to describe computer memory size or disk size.  The rate
in megabits (or kilobits) is often used in statements of
transmission speed or capacity.  In fact, the "kilo" prefix in
computer measurement often refers to 1024 instead of 1000 of
something, and "mega" to 1024 x 1024 = 1,048,576 instead of
1,000,000, reflecting the underlying relationship to the binary
number system and powers of two.
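
To make the difference between bytes and bits, and between
decimal and binary prefixes, concrete, here is a rough Python
sketch.  The constant names and the function transfer_seconds
are invented for this example, and the 10 megabyte file and
10 megabit per second link are made-up figures.

```python
KILO_DECIMAL = 10 ** 3    # 1,000
KILO_BINARY = 2 ** 10     # 1,024
MEGA_DECIMAL = 10 ** 6    # 1,000,000
MEGA_BINARY = 2 ** 20     # 1,048,576

def transfer_seconds(size_megabytes, rate_megabits_per_sec):
    """Seconds to send a file, using decimal megabytes and megabits."""
    size_bits = size_megabytes * MEGA_DECIMAL * 8   # 8 bits per byte
    return size_bits / (rate_megabits_per_sec * MEGA_DECIMAL)

print(KILO_BINARY, MEGA_BINARY)    # 1024 1048576
print(transfer_seconds(10, 10))    # 8.0
```

Because a byte is 8 bits, a 10 megabyte file takes 8 seconds,
not 1 second, on an ideal 10 megabit per second link.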

When we talk of "an order of magnitude" we are talking about
multiplication (or division) by 10.

When looking at graphs, be sure to study the axes.  Often 
numbers are plotted on logarithmic graph paper to make 
a curvilinear relationship look linear (straight line).  We
won't get into logs (logarithms) except to note this common
scientific usage.
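
Without going deeper into logarithms, a short Python sketch
shows why a logarithmic axis straightens out this kind of
curve.  The counts below are hypothetical values that double
each year; they are not real measurements.

```python
import math

# Hypothetical counts that double each year (not real measurements).
counts = [1000 * 2 ** year for year in range(6)]

# On a log scale, each doubling adds the same fixed amount.
logs = [math.log10(c) for c in counts]
steps = [later - earlier for earlier, later in zip(logs, logs[1:])]

print(counts)
print([round(s, 4) for s in steps])   # every step is about 0.301
```

Equal steps on the logarithmic axis mean the points fall on a
straight line, and a step of exactly 1 corresponds to one
order of magnitude.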

Some Internet measurements are used metaphorically to
dramatize the speed of change on the Internet.  For example,
we might talk about a year on the Internet as equivalent to
three months.  Or we might talk about the half-life of information
on the Internet (borrowing from the rate of decay of
radioactivity) to dramatize how fast information about the
Internet becomes out of date.  We might almost expect Yogi Berra
to impart some wisdom similar to his alleged remark that "it gets late
early" in parts of the Yankee Stadium outfield!
  

Special data integrity numbers.

An important kind of number in computing is a computed value that is used to detect corruption of data. Implementations of this principle include the parity bit, checksum, cyclic redundancy check (CRC), and message digest. This typically involves computation of a check digit appended to an existing number, or computation of a summary number (digest), such that a recomputation will show whether the number is corrupted (in the case of a check digit or checksum) or whether the message is corrupted (in the case of a message digest). A good starting point for understanding the concept is the design of the ISBN for books.
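
As an illustration of the check-digit idea, here is a minimal Python sketch for the 10-digit ISBN scheme, in which the digits are weighted 10 down to 1 and the weighted sum must be divisible by 11. The function names are hypothetical, and the sample ISBN is just a commonly cited valid example, not one taken from the class materials.

```python
def isbn10_check_digit(first_nine):
    """Compute the ISBN-10 check digit from the first nine digits."""
    # Weights run 10, 9, ..., 2 across the nine digits.
    total = sum(int(d) * w for d, w in zip(first_nine, range(10, 1, -1)))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)

def isbn10_is_valid(isbn):
    """Recompute the check digit and compare it with the one supplied."""
    digits = isbn.replace("-", "")
    return (len(digits) == 10
            and isbn10_check_digit(digits[:9]) == digits[9].upper())

print(isbn10_check_digit("030640615"))    # 2
print(isbn10_is_valid("0-306-40615-2"))   # True
print(isbn10_is_valid("0-306-40651-2"))   # False (two digits swapped)
```

Recomputing the check digit catches the swapped pair of digits in the last line, which is exactly the kind of copying error the scheme is meant to detect.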

In computing, MD5 is a powerful algorithm for calculating a message digest for input text of arbitrary length. The goal is to authenticate the contents of a message. At any point you can recalculate the message digest and compare it to the original calculation. If even a single character is changed, the digest should come out very different. Algorithms such as MD5 are designed to be extremely resistant to spoofing (finding another message that will yield the same result). An essential requirement, of course, is to have a means of making the original message digest known and stored in a separate place from the message, so that someone cannot change both the message and the original message digest.
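
A short Python sketch shows the effect of a one-character change. It uses the hashlib module from the Python standard library; the helper name md5_hex and the choice of UTF-8 encoding are assumptions for this example, and "abc" is one of the test strings listed in RFC 1321.

```python
import hashlib

def md5_hex(text):
    """Return the MD5 digest of a string as 32 hexadecimal digits."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

original = md5_hex("abc")
altered = md5_hex("abd")    # last character changed

print(original)             # 900150983cd24fb0d6963f7d28e17f72
print(altered)
print(original == altered)  # False
print(len(original) * 4)    # 128 (bits in the digest)
```

Each hexadecimal digit stands for 4 bits, so the 32 digits of the result represent a 128-bit number.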

The MD5 message digest for this document was:

   4f9dd91c69614f38f4343bcc8e63844c

This was computed just before adding this paragraph, so the digest of the document has already changed, but you can see the nature of the number that results. You can also see that it is a hexadecimal number. The MD5 algorithm is described in RFC 1321 by Rivest.

