2. Built-in Data Types

Department of Computer Science
Dr. John S. Mallozzi
Basic data concepts, Python built-in types, Arithmetic in Python, Variables
Copyright © 2010 by John S. Mallozzi

• Bit strings
• Representing text
• Representing other data
• Representing numbers
• Conversion

• Each bit is either 0 or 1, so it can represent two items
• A bit string is a “string of bits,” i.e., a list of bits, such as 11010101
• Two bits can represent four items
  • There are four possibilities: 00, 01, 10, 11
• Three bits can represent eight items
• Four bits can represent 16 items
• Each additional bit doubles the number of items
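The doubling rule can be checked by listing the bit strings directly. This is a minimal sketch; the helper name bit_strings is made up for illustration.

```python
from itertools import product

def bit_strings(n):
    """Return every bit string of length n, in counting order."""
    return ["".join(bits) for bits in product("01", repeat=n)]

# The four possibilities for two bits.
two_bit = bit_strings(2)

# Number of items representable with 1, 2, 3, and 4 bits.
counts = [len(bit_strings(n)) for n in range(1, 5)]
```

Each entry of counts is twice the one before it, which is the same as saying that n bits can represent 2**n items.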

• You have to be able to represent any character (letter, digit, punctuation, special “signaling” character, such as “end of a line” or “tab”)
• Different character sets have been designed
  • ASCII: a 7-bit character set, usually stored one character per 8-bit byte; OK for English and some graphic characters
  • Unicode: originally designed as a 16-bit set, since extended, for “all” languages. This set includes all the ASCII characters
• Exact representation doesn't matter – done by software behind the scenes
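Although the exact representation rarely matters, Python will show the number behind a character on request. A small sketch using the built-in functions ord and chr; the variable names are only for illustration.

```python
# ord gives the number that stands for a character; chr goes the other way.
code_for_A = ord("A")
char_for_97 = chr(97)

# "Signaling" characters have numbers too.
tab_code = ord("\t")
newline_code = ord("\n")
```

All four values fall in the ASCII range, so they are the same in ASCII and in Unicode.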

• Data such as sound (e.g., music), still pictures, or video, usually require many bits of storage
  • Example: each single pixel in a picture requires storage of brightness and color information
• For data requiring too much storage, compression is used
  • Examples: MP3 format for music, JPEG for pictures
• In many compressed file formats (including MP3 and JPEG), some information is lost

• Numerals
• Positional notation – base 10
• Other bases
• Digits
• Names of numbers
• Hexadecimal
• Arithmetic
• Other numbers

• The symbol used to write a number is called a numeral
  • 23 is an Arabic numeral and XXIII is a Roman numeral for the number we call “twenty-three”
  • Our standard (Arabic) numerals are positional: the base 10 digits are used
  • Roman numerals are not positional – they do not use a base
• Using a base makes possible algorithms for doing computations such as multiplication and division
  • You learned algorithms for multiplying and dividing Arabic numerals
  • Try them with Roman numerals!

• Example: 2504
    2 × 1000 = 2000
    5 × 100  =  500
  + 0 × 10   =    0
  + 4 × 1    =    4
• Each position has a multiplier times a power of 10
    Digit:     2     5    0   4
    Position:  1000  100  10  1

• Base 10 probably arose because…
• Other bases can be used – any integer greater than 1 would work
  • But it would take some getting used to!
• Each position will be a power of the base
  • Example: in base 13, positions from right to left represent 1, 13, 13² = 169, 13³ = 2197
• Example
  • If the numeral 642 is interpreted in base 13, it means 6 × 13² + 4 × 13 + 2 × 1
  • This is the number whose base 10 numeral is 1068
• Base 2, or binary numerals, are best for digital computation
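The positional rule can be written out as a short calculation. The function from_base below is a hypothetical helper, not something built into Python; the built-in int, which accepts a base as its second argument, is used as a cross-check.

```python
def from_base(digits, base):
    """Interpret a list of digit values, most significant first, in the given base."""
    total = 0
    for digit in digits:
        # Shift everything seen so far one position left, then add the new digit.
        total = total * base + digit
    return total

value_in_base_13 = from_base([6, 4, 2], 13)   # 6 × 13² + 4 × 13 + 2 × 1
value_in_base_10 = from_base([6, 4, 2], 10)

# Python's own conversion from a numeral written as text.
builtin_check = int("642", 13)
```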

• You need as many digits as the value of the base
  • Ten for base ten: 0,1,2,3,4,5,6,7,8,9
  • Eight for base eight: 0,1,2,3,4,5,6,7
  • Thirteen for base thirteen: 0,1,2,3,4,5,6,7,8,9 and three more
    • Customary to use letters: for base thirteen, A, B, and C
• Example: to write the number whose decimal numeral is 642 in base 13, we write 3A5
  • 3 × 13² + A × 13 + 5 × 1 = 642 (10 is written A)

• We are used to writing numbers in base 10
• Even the names we use for numbers are based on 10
  • “six hundred forty two”
• This makes it hard to talk about other bases
• Sometimes we write 642₁₀ and 642₁₃ for “the numeral 642 interpreted in base ten” and “the numeral 642 interpreted in base thirteen”
  • So 642₁₀ = 3A5₁₃ and 642₁₃ = 1068₁₀
• Not for humans!

• Decimal and binary are most important – the first for us, the second for the computer
• When we want to look at the actual representation inside a computer, hexadecimal (base 16) is more convenient than binary
• Hexadecimal notation allows us to replace each 4-bit bit string by a single symbol, by writing the hexadecimal equivalent of each 4-bit string
  • Example: 001110110001₂ = 3B1₁₆
      0011  1011  0001
        3     B     1
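The grouping into 4-bit pieces can be sketched in a few lines. The variable names are illustrative, and the bit string is assumed to have a length that is a multiple of 4.

```python
HEX_DIGITS = "0123456789ABCDEF"

bits = "001110110001"

# Cut the bit string into 4-bit groups, left to right.
groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]

# Replace each group by the single hexadecimal digit with the same value.
hex_numeral = "".join(HEX_DIGITS[int(group, 2)] for group in groups)
```

Both numerals name the same number, which can be confirmed by comparing int(bits, 2) with int(hex_numeral, 16).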

• The algorithms you learned for arithmetic may be used no matter what the base
• Example: addition algorithm
  • Add column by column, from right to left. If answer has more than one digit, carry to next column
  • Base 10 example:
        1 1
        2 6 5
      + 4 7 7
        7 4 2
  • Base 2 example:
        1 1 1
        1 1 1 1 0 1
      + 1 1 0 1 1 0
      1 1 1 0 0 1 1
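The column-by-column addition can be written once and reused for any base. This is a sketch under a simplifying assumption: the numerals use only the digits 0 through 9, so the base must be 10 or less. The name add_in_base is invented for this example.

```python
def add_in_base(a, b, base):
    """Add two numerals given as digit strings, working right to left with carries."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)   # pad the shorter numeral with leading zeros
    carry = 0
    columns = []
    for digit_a, digit_b in zip(reversed(a), reversed(b)):
        total = int(digit_a) + int(digit_b) + carry
        columns.append(str(total % base))   # digit written in this column
        carry = total // base               # amount carried to the next column
    if carry:
        columns.append(str(carry))
    return "".join(reversed(columns))

decimal_sum = add_in_base("265", "477", 10)
binary_sum = add_in_base("111101", "110110", 2)
```

The same loop handles both examples; only the base argument changes.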

• To represent a negative number, you can use the first bit for a sign
• This is almost what is done in most computers
  • Leftmost bit is for the sign (0 for +, 1 for -)
  • But two’s complement notation is used
  • (Details in homework)
• For decimals, floating-point format is used: two numbers are stored
  • Exponent
  • Mantissa
  • (Details later)

• From another base to decimal is easy
• What about the other way?
• Here is an algorithm:
  • Set the dividend equal to the original number
  • Repeat as long as the dividend is not zero
    • Divide the dividend by the new base
    • Remember the remainder
    • Set the dividend equal to the quotient
  • Form the numeral in the new base by writing the remainders in reverse order
• …but we don't have to do this – or understand why it works
  • (except in homework!)

• Convert 44 to base 2
  • Dividend = 44. Dividend not 0.
  • Divide by 2: quotient = 22, remainder = 0
  • Set dividend = 22. Dividend not 0, so repeat.
  • Divide by 2: quotient = 11, remainder = 0
  • Set dividend = 11. Dividend not 0, so repeat.
  • Divide by 2: quotient = 5, remainder = 1
  • Set dividend = 5. Dividend not 0, so repeat.
  • Divide by 2: quotient = 2, remainder = 1
  • Set dividend = 2. Dividend not 0, so repeat.
  • Divide by 2: quotient = 1, remainder = 0
  • Set dividend = 1. Dividend not 0, so repeat.
  • Divide by 2: quotient = 0, remainder = 1
  • Set dividend = 0. Dividend is 0, so stop.
• Base 2 numeral is 101100
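The repeated-division algorithm translates almost line for line into Python. This is a sketch, assuming a non-negative whole number and a base between 2 and 36; the names to_base and DIGITS are chosen for this example.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def to_base(number, base):
    """Write a non-negative integer as a numeral in the given base (2 to 36)."""
    if number == 0:
        return "0"
    remainders = []
    dividend = number
    while dividend != 0:
        # Divide by the new base; keep the remainder, continue with the quotient.
        dividend, remainder = divmod(dividend, base)
        remainders.append(DIGITS[remainder])
    # The remainders come out last digit first, so reverse them.
    return "".join(reversed(remainders))

binary_44 = to_base(44, 2)
base_13_642 = to_base(642, 13)
```

The second call reproduces the earlier base 13 example, with the digit ten written as A.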

• Different kinds of numbers
• Python integer types
• Range of decimal numbers
• Floating-point numbers
• Writing floating-point numbers
• Python floating-point type
• Other Python built-in types

• Because numbers must be stored in a limited amount of space, storage differs depending on
  • Whether the number is an integer or a decimal
  • How large – and in the case of decimals, how precise – we require our represented numbers to be
• Computer arithmetic distinguishes among
  • Whole numbers (integers)
  • Floating-point numbers
    • Decimal numbers, but not the same as fixed-point numbers
• Python tries to minimize our need to worry about these matters, but also to be efficient – which does require some involvement

• For efficiency, Python uses a size for integers that is most efficient for the Python implementation
  • This restricts the size of both positive and negative integers that can be stored
  • The largest integer in a typical implementation is 2,147,483,647, but you should not memorize this – it is not the same for every implementation, and, if you really need it, you can get its value for your implementation from the Python library, as sys.maxint
• However, Python allows us to avoid such concerns, using “unlimited” integer range
  • When we write an expression, we can simply ignore the limitation
  • When Python writes a result larger than the maximum integer it stores, it appends L (for “long”) to the result, reminding us that more storage, and time, is being used for this number
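A short sketch of the “unlimited” range. The typical maximum is written out by hand as 2**31 - 1 rather than read from sys.maxint, an assumption made so that the sketch does not depend on the implementation or Python version in use.

```python
# The typical largest stored integer quoted above, built by hand.
typical_max = 2 ** 31 - 1

# Going past it simply works; Python switches to more storage as needed.
beyond = typical_max + 10

# Much larger values are fine too.
huge = 2 ** 100
```

No special syntax is needed on our side; whether a trailing L is shown when the result is displayed depends on the Python version.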

• Scientists sometimes require numbers as small as 0.00000000000000000000000000000000067 (Planck’s Constant)
• In other circumstances, a number like 588,000,000,000,000,000.0 (approximate size of Milky Way in miles) can be considered small
• Number of decimal places needed to simultaneously represent both these examples is more than 50
• Using floating-point numbers, both can be represented using at most three significant digits

• Idea is to trade accuracy for range
• Python must remember three things about each number
  • Sign (+ or -)
  • Significant digits
    • Ignore leading and trailing zeros
    • Examples: 67 for Planck’s Constant and 588 for size of Milky Way
  • Exponent
    • Number of places to move the decimal point starting from the position after the first significant digit, with sign indicating direction to move (- for left, + for right)
    • Examples: -34 for Planck’s Constant and +17 for size of Milky Way

• You have to express the three important parts: sign, significant digits, and exponent
• Significant digits written with decimal point after first significant digit
  • This is called the mantissa of the number
• Exponent written after the letter E
• Examples
  • Planck’s Constant: 6.7E-34
  • Size of Milky Way: 5.88E17
• Note: scientists write 6.7·10⁻³⁴ and 5.88·10¹⁷. This is scientific notation.
  • Easier to use E in typing, though – no superscripts
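The E notation can be typed directly into Python. The sketch below also formats the values back into E notation with the % operator; the number of digits requested (.1 and .2) is a choice made for this example.

```python
# Floating-point literals written with E notation.
planck = 6.7E-34
milky_way = 5.88E17

# Ask Python to display each value as mantissa, E, exponent.
planck_text = "%.1E" % planck
milky_way_text = "%.2E" % milky_way
```

Python writes a + sign in front of a positive exponent, so the second text reads 5.88E+17.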

• Python uses a storage scheme with very large range, so you don’t have to worry about that
  • The covered range in a typical implementation is about -10³⁰⁸ to +10³⁰⁸
• Although the range is large enough so you don’t have to worry, the accuracy is limited to the extent that you do have to be concerned
  • The number of significant digits in a typical implementation is about 15
  • Business calculations, for example, may need to have greater accuracy
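The range and accuracy limits can be inspected from the standard library. This sketch assumes the typical implementation described above; the decimal module shown at the end is one library option for exact decimal arithmetic and is an addition here, not something the slides cover.

```python
import sys
from decimal import Decimal

# Largest float and the number of reliable significant decimal digits.
largest_float = sys.float_info.max
reliable_digits = sys.float_info.dig

# Limited accuracy shows up even in simple sums.
float_sum = 0.1 + 0.1 + 0.1
float_sum_is_exact = (float_sum == 0.3)

# Exact decimal arithmetic, as a business calculation might need.
decimal_sum = Decimal("0.10") + Decimal("0.10") + Decimal("0.10")
```

On a typical implementation float_sum_is_exact is False, while decimal_sum is exactly 0.30.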

• Boolean type for logic
  • Name of type is bool
  • Only values are True and False (note capitalization)
  • Python considers bool to be a numeric type, so nonsense arithmetic works (but should not be used)
• String type for text
  • Surround text with either ' or " quotes – Python generally uses ' for display
  • More on this type later
• There are types we won’t look at – complex, for example – but we’ll work with some others later
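A brief sketch of both points, using only built-in behavior; the variable names are illustrative.

```python
# bool counts as a numeric type, so this "nonsense arithmetic" works.
nonsense = True + True
bool_is_numeric = isinstance(True, int)

# Either kind of quote produces the same string.
single_quoted = 'hello'
double_quoted = "hello"
same_text = (single_quoted == double_quoted)

# repr shows how Python displays the string: with single quotes.
displayed = repr(double_quoted)
```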

• Differences
• Addition, subtraction, and multiplication
• Division
• Powers
• The print statement

• Each answer must have a type (int, long, float, str, …)
  • Sometimes conversion is automatic, but not always
• Some new symbols must be used, since standard arithmetical symbols aren’t available
  • Examples: 2 × 3, 2 · 3, 3 ÷ 4, ¾, √5
  • In Python, we use * for multiplication and / for division
    • There are no built-in fractions, but the library has a way to use them – instead use decimals
    • We’ll talk about how to deal with roots later

• For these operations, arithmetic is as usual, except for the symbols used and some size/accuracy issues
• Examples
    2*3 + 5*6 gives the answer 36
    64/2 - 5*7 gives the answer -3
    2.3 * 4.2 gives 9.6600000000000001 (inaccuracy after 15th decimal place)
    2147483645 + 10 gives 2147483655L (the largest int that can be stored is 2147483647, so the last result is converted automatically to long)
• Note that you are not permitted to use commas

• Even in ordinary arithmetic, integer division of integers differs from decimal division
  • Example: 11 ÷ 3 gives quotient 3 with remainder 2 for integer division, or 3.666666666666666… (“forever”) for decimal (“long”) division
• In Python, if we want an integer result we must start with two integers: 11/3 gives 3
  • For the remainder, use the symbol % (weird!): 11%3 gives 2
• If at least one of the numbers is a decimal (type float), you get a decimal answer
  • 11.0/3 and 11/3.0 both give 3.6666666666666665 (note the inaccuracy of the answer)
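A minimal sketch of the two kinds of division. It uses the // operator for the integer quotient instead of a single /, on the assumption that the reader may be running Python 3, where / between two integers gives a decimal result; // gives the integer quotient in both versions.

```python
# Integer division: quotient and remainder.
quotient = 11 // 3
remainder = 11 % 3

# divmod returns both at once.
both = divmod(11, 3)

# With at least one float, the answer is a decimal.
decimal_result = 11.0 / 3
other_decimal_result = 11 / 3.0
```

The quotient and remainder always fit together again: quotient * 3 + remainder gives back 11.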

• In math, we use superscripts to indicate exponents
  • Example: 2³ means 2×2×2, whose value is 8
• To raise a number to a power in Python, use **
  • Example: 2**3 means 2*2*2, whose value is 8
• As in mathematics, negative and decimal exponents can be used, with the same meaning as in mathematics
  • Example: Value of 2**-3 is 0.125
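A few powers, as a quick sketch. The decimal exponent 0.5 is an added illustration of the “same meaning as in mathematics” remark: raising to the power one half takes a square root.

```python
cube = 2 ** 3                   # same as 2*2*2
negative_exponent = 2 ** -3     # same as 1 / 2**3
decimal_exponent = 9 ** 0.5     # square root of 9
```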

• If you place print in front of a Python value, an attempt will be made to show you the result in a way that is “prettier” and more expected
  • Idea is to show what a user would expect
  • But keep in mind that this has no effect on calculations

• Basics
• Naming variables
• Assignment
• Using the print statement
• String concatenation
• Incrementing a variable

• We often have to remember some information for later in a calculation
• You may use a variable to refer to that information
  • You can use an identifier of your choosing to name the variable (within some rules we will look at)
• More precisely, a variable is something that points to the information you want to remember
  • The difference is irrelevant for numbers, but will sometimes be important
• Called a “variable” because you can change the information to which it points
  • Example: If the variable is called age, on your birthday you can change which number it refers to

• The name of a variable must be a single “word”
  • For example, age but not current age
• Case matters
  • For example, Python considers age and Age to be different
• You can use case differences to combine multiple words into one, satisfying Python and aiding readability
  • Example: currentAge
• The underscore symbol _ counts as a “letter,” so you can use it between words to make the variable name look like two words while satisfying Python
  • Example: current_age

• To associate a name with a value, we use an assignment: variable = value
• Examples:
    granddaughter = 'Jessica'
    age = 10
• Assignment defines the relationship (binding) between a variable and the value to which it refers
• Without a definition, use of a name is considered an error
• A new assignment changes or redefines the reference
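The three points about assignment can be seen in a short sketch. The name shoe_size is deliberately left undefined to trigger the error; it and the flag variable are invented for this example.

```python
# Binding names to values.
granddaughter = 'Jessica'
age = 10

# A new assignment makes the name refer to a different value.
age = 11

# Using a name that was never assigned is an error (a NameError).
try:
    shoe_size
    undefined_name_is_error = False
except NameError:
    undefined_name_is_error = True
```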

• The print statement distinguishes strings (text) from variable names because the former appear in quotes
• If you want to print several different items, use commas to separate the items

• Using a + sign between strings causes the strings to be concatenated (chained together)
• Unlike the case with print, you must put in blank spaces yourself
• You can also concatenate numbers to strings, but you must tell Python to treat the number as a string, using str
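A small sketch of concatenation, reusing the variable names from the assignment examples. The print call is written with parentheses around a single item, a form assumed here because it behaves the same with or without the print statement.

```python
granddaughter = 'Jessica'
age = 10

# Blank spaces must be included by hand; the number needs str.
message = granddaughter + ' is ' + str(age) + ' years old'
print(message)

# Leaving out str would try to add a string and a number, which is an error.
try:
    broken = granddaughter + ' is ' + age
    needs_str = False
except TypeError:
    needs_str = True
```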

• We often want to increase the value of a variable by some amount, e.g., 1
  • More precisely, we want to replace the value to which a variable refers by one more than that value
• This is called incrementing the variable
• We do this by using an odd-looking assignment statement, in which the variable appears on both sides of =

• Incrementing can also be specified using some shorthand notation
  • Example: instead of x = x + 2 you can write x += 2
• A similar notation works for subtraction and other operations
  • Example: x -= 2 for x = x - 2
• These shorthand notations are purely optional – use them if you want to
  • But you should be able to recognize them if someone else writes them
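The long and short forms side by side, as a sketch; the starting value 5 and the use of *= as one of the “other operations” are choices made for this example.

```python
x = 5

x = x + 1    # long form: x now refers to 6
x += 2       # shorthand for x = x + 2, giving 8
x -= 2       # shorthand for x = x - 2, giving 6
x *= 3       # the same pattern works for multiplication, giving 18
```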
