CS 3330: Lecture 3: Data

This page does not represent the most current semester of this course; it is present merely as an archive.

Announcements

Reading

I targeted the reading to consume both the reading and HW time, since there was no HW this past week (6-10 hours); how did I do?

Quiz

Early results suggest 72% as the mean score. We may revisit this.

Lab tomorrow, HW next day

A set of binary number puzzles in C

In lab, group work is encouraged but not required. We'll grade mostly for effort.

There's a “Beat the Prof” contest if you care (not associated with grade)

HW next day will have the exact same instructions as the lab, except:

Do it on your own (no groups)
Different bits.c (i.e., different puzzles)

TAs

This version of this course has not been taught here before, and not all our TAs have English as their first language. If you have any trouble with TAs (either communicating with them or them understanding the material), please let me know in person or by email (and if possible identify the TA(s) in question) so we can resolve the issue.

We are finalizing office hours now. They should be posted later this week.

Topics

Bit Vectors and Binary

Wires, Transistors, and Bits

Transistors care about high vs low voltage. We usually write that as 1 (high) and 0 (low).

Given n wires, there are 2ⁿ possible combinations of 0 and 1. Usually we use those cases to represent numbers in binary. In binary, we call each 0 or 1 a “bit” instead of a “digit”.

Binary to Decimal

A number like 3102 is presented in place values. It means 3 × thousand + 1 × hundred + 0 × ten + 2 × one.

The places are powers of ten: 3 × ten³ + 1 × ten² + 0 × ten¹ + 2 × ten⁰

In binary we use two, not ten, and the biggest digit is 1 not 9. So binary 11010 means 1 × two⁴ + 1 × two³ + 0 × two² + 1 × two¹ + 0 × two⁰

To convert binary to decimal, write out those twoⁿ in decimal: 1 × 16 + 1 × 8 + 0 × 4 + 1 × 2 + 0 × 1 = 26
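The place-value rule can be sketched in C. This is a minimal illustration, not course-provided code: the helper name bin_to_value and its string-based interface are assumptions. It reads bits left to right and doubles the running total at each step, which has the same effect as multiplying each bit by its power of two.

```c
#include <stddef.h>

/* Hypothetical helper (name and interface are assumptions for illustration).
 * Converts a string of '0'/'1' characters to the number it represents.
 * Conversion stops at the first character that is not '0' or '1'. */
unsigned long bin_to_value(const char *bits)
{
    unsigned long value = 0;
    for (size_t i = 0; bits[i] == '0' || bits[i] == '1'; i++) {
        /* Moving one place to the right doubles every place value read so far. */
        value = value * 2 + (unsigned long)(bits[i] - '0');
    }
    return value;
}
```

For example, bin_to_value("11010") gives 26, matching the hand conversion.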

Hexadecimal

It is easy to get lost looking at a number like 11011110101011011011111011101111. The usual way to help with that in base 10 is to group digits in threes with commas. In binary we group four bits instead. Then, since there are 16 possible values for a group of 4 bits, we write each group as a single symbol: 0, 1, 2, 3, …, 8, 9, a, b, …, e, and f. These are, in order:

bits	decimal	hexadecimal
0000	0	0
0001	1	1
0010	2	2
0011	3	3
0100	4	4
0101	5	5
0110	6	6
0111	7	7
1000	8	8
1001	9	9
1010	10	a
1011	11	b
1100	12	c
1101	13	d
1110	14	e
1111	15	f

We usually write hexadecimal with a 0x in front.

Thus, 11011110101011011011111011101111
is 1101=d 1110=e 1010=a 1101=d 1011=b 1110=e 1110=e 1111=f
is 0xdeadbeef
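The grouping into 4-bit chunks maps directly onto shifts and masks. The sketch below assumes a 32-bit value and a hypothetical helper named hex_digit. It pulls out one group of four bits and looks up its symbol in the same order as the table.

```c
#include <stdint.h>

/* Hypothetical helper (an assumption for illustration).
 * Returns the hex symbol for 4-bit group number `group` of x, where
 * group 0 is the leftmost (highest-order) group and group 7 the rightmost.
 * Requires 0 <= group <= 7. */
char hex_digit(uint32_t x, int group)
{
    static const char digits[] = "0123456789abcdef";
    /* Shift the wanted group down to the low 4 bits, then mask off the rest. */
    uint32_t four_bits = (x >> (28 - 4 * group)) & 0xfu;
    return digits[four_bits];
}
```

Calling hex_digit(0xdeadbeef, g) for g from 0 to 7 yields d, e, a, d, b, e, e, f in turn.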

Binary Math

Addition

Consider a simple decimal addition problem

( 11  carry)
  399
+   1
-----
  400

Why the 0s and the carries? Because 9 is the biggest digit available.

Binary: the same as decimal except that 1 is the biggest digit available

(1111 1  carry)
  111101
+  10101
---------
 1010010
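The column-by-column procedure can be written out in C as a sketch. The function name add_by_columns and the fixed 32-bit width are assumptions. Real hardware and the C + operator produce the same result; the loop only makes the carries visible.

```c
#include <stdint.h>

/* Hypothetical helper: add two 32-bit numbers one column at a time,
 * right to left, exactly as in the hand-worked example. */
uint32_t add_by_columns(uint32_t a, uint32_t b)
{
    uint32_t sum = 0;
    uint32_t carry = 0;
    for (int i = 0; i < 32; i++) {
        uint32_t abit = (a >> i) & 1u;
        uint32_t bbit = (b >> i) & 1u;
        uint32_t total = abit + bbit + carry;   /* 0, 1, 2, or 3 */
        sum |= (total & 1u) << i;               /* the digit written in this column */
        carry = total >> 1;                     /* 1 if the column reached two or more */
    }
    /* A carry out of the last column is dropped, as in fixed-width hardware. */
    return sum;
}
```

With the operands from the example, add_by_columns(0x3d, 0x15) gives 0x52, which is 1010010 in binary.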

Subtraction

Consider simple subtraction

( 11111  borrow)
  300100
-    201
--------
  299899

Why the 8 in the 100s place?

it was initially 1 - 2,
-1 because the 10s place borrowed from it
+10 because it borrowed from the 1000s place
= 8

Binary: let 1 be the biggest digit available

( 11111  borrow)
  100000
-    101
---------
   11011

Why the 0 in the 4s place?

it was initially 0 - 1,
-1 because the 2s place borrowed from it
+2 because it borrowed from the 8s place
= 0
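The borrow bookkeeping can be sketched the same way as addition. The name sub_by_columns and the 32-bit width are assumptions made for illustration. Each column computes its own difference, and a negative result borrows two from the next column.

```c
#include <stdint.h>

/* Hypothetical helper: subtract b from a one column at a time,
 * right to left, tracking the borrow as in the hand-worked example. */
uint32_t sub_by_columns(uint32_t a, uint32_t b)
{
    uint32_t diff = 0;
    int borrow = 0;
    for (int i = 0; i < 32; i++) {
        int abit = (int)((a >> i) & 1u);
        int bbit = (int)((b >> i) & 1u);
        int d = abit - bbit - borrow;   /* between -2 and 1 */
        borrow = 0;
        if (d < 0) {
            d += 2;                     /* borrow one "two" from the next column */
            borrow = 1;
        }
        diff |= (uint32_t)d << i;
    }
    /* A borrow out of the last column is dropped. */
    return diff;
}
```

With the operands from the example, sub_by_columns(0x20, 0x5) gives 0x1b, which is 11011 in binary.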

Multiplication

Consider how we do it in decimal

   12308
 x  1001
 -------
(  1     carry)
   12308
  00000
 00000
12308
--------
12320308

The only part that changes in binary is the addition

     11101
   x  1001
   -------
(11111    carry)
     11101
    00000
   00000
  11101
  --------
 100000101
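Because each binary digit of the multiplier is 0 or 1, every partial product is either all 0s or a shifted copy of the other number. A minimal sketch of that shift-and-add idea follows; the name mul_shift_add and the 32-bit width are assumptions.

```c
#include <stdint.h>

/* Hypothetical helper: multiply by adding a shifted copy of `a`
 * for every 1 bit in `b`. Bits that overflow 32 bits are discarded. */
uint32_t mul_shift_add(uint32_t a, uint32_t b)
{
    uint32_t product = 0;
    for (int i = 0; i < 32; i++) {
        if ((b >> i) & 1u) {
            product += a << i;   /* partial product for the 2^i place */
        }
    }
    return product;
}
```

With the operands from the example, mul_shift_add(0x1d, 0x9) gives 0x105, which is 100000101 in binary.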

2's Complement

What is 0 − 1 in binary? We'll use the normal subtract-with-borrow, remembering that there are always more hidden 0s in front of any number.

(...111111111  borrow)
  ...000000000
-            1
--------------
  ...111111111

How about 0 − 13?

(...111111111  borrow)
  ...000000000
-         1101
--------------
  ...111110011

From this we see that 0 - x = ~x + 1. If we use this rule then math “just works”—we can ignore sign in our arithmetic.
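The rule 0 - x = ~x + 1 is easy to check in C on unsigned values, where arithmetic wraps around at a fixed width. The helper name negate_bits is an assumption for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: form the 32-bit pattern for 0 - x
 * by flipping every bit and adding one. */
uint32_t negate_bits(uint32_t x)
{
    return ~x + 1u;
}
```

negate_bits(13) gives 0xfffffff3, the same ...11110011 pattern as the subtraction, and adding 13 back wraps around to 0.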

How do we decide if a number is positive?

For unsigned types like unsigned, it is never negative, and the borrow-off-the-end phenomenon is called “overflow”, usually meaning that something went wrong.
For signed types like int, if the high-order bit is a 1 then the number is negative.
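Checking the high-order bit can be sketched as follows, assuming a 32-bit signed type. The name is_negative is hypothetical. The value is converted to unsigned first so that the shift only inspects the bit pattern.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if the high-order bit of x is set
 * (x is negative), 0 otherwise. */
int is_negative(int32_t x)
{
    /* Converting to uint32_t keeps the same 32-bit pattern. */
    return (int)(((uint32_t)x >> 31) & 1u);
}
```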

Bit shifts

See § 2.3.7 for how bit shifting is like multiplying or dividing by a power of 2.

To make this work, in (most versions of) C >> will “sign-extend” signed types: if the high-order bit was a 1 the new bits added to the left side are also 1s; if it was a 0 the new bits are also 0s.
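A small sketch of the three cases, with hypothetical helper names. The C standard leaves >> on a negative signed value implementation-defined, so the sign-extending behavior in the last function is an assumption about the compiler (it holds for common compilers such as gcc and clang). All three require 0 <= n < 32.

```c
#include <stdint.h>

/* Left shift by n multiplies by 2^n, as long as no 1 bits fall off the top. */
uint32_t times_pow2(uint32_t x, int n)
{
    return x << n;
}

/* On unsigned types, >> fills the left with 0s: divide by 2^n, rounding down. */
uint32_t unsigned_shift_right(uint32_t x, int n)
{
    return x >> n;
}

/* On signed types, most compilers sign-extend, so a negative number
 * stays negative. This is implementation-defined, not guaranteed by C. */
int32_t signed_shift_right(int32_t x, int n)
{
    return x >> n;
}
```

On a sign-extending compiler, signed_shift_right(-16, 2) gives -4, while unsigned_shift_right applied to the same bit pattern (0xfffffff0) gives 0x3ffffffc.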

Floating-point

Three parts

A number like 31.4e-1 has three parts: a sign (+), a significand or mantissa (31.4), and an exponent (-1).

In binary, we encode

the sign as a single bit (0 is positive, 1 is negative, just like the high-order bit of 2's-complement numbers)
the exponent as an unsigned binary number: the actual exponent plus an offset or bias
the mantissa as the fractional part of the number shifted to “1.whatever”. We don't record the “1.” (we don't need to, it's always “1.”) and the shifting may change the exponent.

`float` and `double`

The type float has 1 sign bit, 8 exponent bits, and 23 fraction bits, with a bias of 127 (0x7f).

The type double has 1 sign bit, 11 exponent bits, and 52 fraction bits, with a bias of 1023 (0x3ff).
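The float layout can be inspected from C by copying the bytes of a float into a 32-bit integer and masking out each field. This is a sketch assuming float is the 32-bit format described here, which is true on essentially all current machines; the helper names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Copy the raw 32 bits of a float into an integer (no numeric conversion). */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* The top bit: 0 for positive, 1 for negative. */
unsigned float_sign(float f)
{
    return (unsigned)(float_bits(f) >> 31);
}

/* The 8 exponent bits as stored, i.e. actual exponent + 127. */
unsigned float_exponent_field(float f)
{
    return (unsigned)((float_bits(f) >> 23) & 0xffu);
}

/* The 23 fraction bits (the part after the implied "1."). */
uint32_t float_fraction_field(float f)
{
    return float_bits(f) & 0x7fffffu;
}
```

For 1.0f the exponent field is 127 (actual exponent 0 plus the bias) and the fraction field is 0.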

In binary, 1e-1 = 0.0001100110011 and so on, repeating forever. Placing this in 1.whatever form we have an exponent of −4 and a mantissa of 1.100110011...

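As a float, cutting the repeating fraction off at 23 bits (hardware that rounds to nearest stores 101 as the last three bits instead), that's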

0 01111011 10011001100110011001100

As a double, again cutting the fraction off (rounding to nearest makes the last four bits 1010), it's

0 01111111011 1001100110011001100110011001100110011001100110011001
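Going the other direction, the three fields can be packed into a 32-bit pattern and reinterpreted as a float. This sketch assumes the float layout with bias 127, and the helper names are hypothetical. Note that a compiler converting the literal 0.1f normally rounds to nearest, so its last fraction bit differs from the pattern obtained by simply cutting off the repeating bits.

```c
#include <stdint.h>
#include <string.h>

/* Pack sign (0 or 1), an actual exponent (before the bias is added),
 * and a 23-bit fraction into the float bit layout. */
uint32_t pack_float_bits(unsigned sign, int exponent, uint32_t fraction)
{
    uint32_t stored_exponent = (uint32_t)(exponent + 127);   /* add the bias */
    return ((uint32_t)sign << 31)
         | ((stored_exponent & 0xffu) << 23)
         | (fraction & 0x7fffffu);
}

/* Reinterpret a 32-bit pattern as a float (no numeric conversion). */
float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

The cut-off fraction 10011001100110011001100 is 0x4ccccc, so pack_float_bits(0, -4, 0x4ccccc) gives 0x3dcccccc; the rounded fraction 0x4ccccd gives 0x3dcccccd.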

Non-normal values

If the exponent is all 0s or all 1s then it has special meaning.

For all 0s, the number is “denormalized”: we no longer assume the leading “1.” in the fraction (we use “0.” instead) and we use an exponent of 1 − bias instead of 0 − bias.

For all 1s, the number is either infinity (if the fraction is all 0s) or a Not a Number (NaN) value (if there is at least one set bit in the fraction).
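These special cases amount to a check on the exponent field, followed by a check on the fraction. A sketch for the float layout follows; the enum and function names are assumptions. Zero has an all-0s exponent field, so it falls in the denormalized case here.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum float_kind { FLOAT_NORMAL, FLOAT_DENORMAL, FLOAT_INFINITY, FLOAT_NAN };

/* Classify a 32-bit float bit pattern by its exponent and fraction fields. */
enum float_kind classify_bits(uint32_t bits)
{
    uint32_t exponent = (bits >> 23) & 0xffu;
    uint32_t fraction = bits & 0x7fffffu;

    if (exponent == 0) {
        return FLOAT_DENORMAL;      /* leading "0.", exponent 1 - bias; includes zero */
    }
    if (exponent == 0xffu) {
        return fraction == 0 ? FLOAT_INFINITY : FLOAT_NAN;
    }
    return FLOAT_NORMAL;            /* leading "1.", exponent = field - bias */
}
```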

Done with floating-point

For most of this semester we'll ignore floating-point numbers.