Useful formulas for use in MS Excel
1. 2. 3. 4.
Dividing two numbers to get the product in pounds, shillings and pence Separating two strings bounded by a comma Using conditional statements Counting within a list delimited by sequential numbers
1. Dividing two numbers to get the product in pounds, shillings and pence A
B
C
D
1
Persons
Houses
RV
2 St. Marylebone
161860
16357
1053748
3 St. Pancras
198788
21852
925872
4 St. George's, Hanover Square
87771
10437
1076722
5 Islington
155341
20704
777632
E
F
G
To get the rateable value (RV) per person in pounds and shillings and pence, use the following three formulae in cells E2, F2, and G2: E2: =SUM(INT(D2/B2)) F2: =INT(SUM((D2/B2)-E2)*240/12) G2: =INT(SUM(((D2/B2)-E2-(F2/240*12)))*240) The formula for cell E2 gives the product of the division between the rateable value rounded down (using the function INT) to the nearest whole number. The formula in cell F2 takes the remainder of the previous sum, converts it to pennies, by multiplying by 240 and then divides by 12 to get shillings. Again an integer is given, rounding the whole number down. The formula in G2 repeats the calculations in F2 leaving a fraction of a pound which when multiplied by 240 gives the number of pence.
2. Separating two strings bounded by a comma A
B
C
NAME 1 Davies, Peter 2 Davis, Hezekiah
To separate the two elements of this variable use the following formulae in cells B1 and C1: B1: =LEFT(A1, FIND(",",A1)-1) C1: =RIGHT(A1,(LEN(A1)-FIND(",",A1)-1)) The LEFT and RIGHT functions are used to truncate a string, starting either from the left of that string or the right. The form is LEFT(string, number) where string is the cell you are interested in and number is the number of characters you want to take from that string. Thus =LEFT(shillings, 1) returns “s”. The second function used here is FIND. This function “finds” the character given in its first argument in the string given in its second argument. Thus =FIND(s, pounds) will return the number 6. Note that the FIND function only returns the character position of the first occurrence of that character. The first of these formulae then first finds the position of the comma in the string. (In the first case this is position 7. Then it truncates, from the left, a total of 6 characters (7 minus 1). This is necessary here because we don’t want the comma to appear in our result. The second of these formulae first finds the total length of the string, using the LEN function. Then finds the position of the comma. The latter number is subtracted from the first finding the length of the string after the comma. Because there is a space after the comma in both examples we remove one fewer character. The resulting number is used as part of the RIGHT function which takes the number of characters in its second argument from the right hand side of the first argument.
3. Using conditional statements Excel allows one to perform rudimentary conditional statements. Consider the census enumerators’ books where originally data relating to sex was denoted by being placed in the age columns: A
B
1
Frederick
2
Florence
3
Violet
4 5
Skipper
C
D
E
F
AGE_M
AGE_F
Head Wife
43
Skipper
Daughter
12
Ethel
Skipper
Daughter
11
Michael
Skipper
Son
4
Here we know that Frederick Skipper is male because his age is recorded in the Male column. Suppose that we wanted to add a column which gave the sex of each of these people, we would use the IF function. In cell G1 we would use the following formula: =IF(E1>0,"M","F"). This is saying that if the contents of the cell E1 (which is the male age) is greater than 0 then this person is a male and return the string “M”, otherwise (i.e., if the contents of E1 is less than 0 or missing) then return the string “F”. This is not entirely foolproof, because if the contents of the column for AGE_M contains anything at all, like a dash, it will return “M”. The basic syntax of the IF statement can be summarised as =if(condition is fulfilled, something, otherwise something else). IF statements are particularly useful because they can be nested. Nesting a condition, allows one to ask many (up to seven) conditions within a single statement. In the following example let us suppose that we want to classify the seven districts in Cornwall according to their population density. In our classification we want to have three groups, those districts which have a population density of less than one person per acre; those which have a population density of between one and five persons per acre and third, those which have a density of over five persons per acre. A
B
C
D
Acres
Population Density
1
Bomin MB
3312
5514
1.66
2
Bude-Stratton
4294
4459
1.04
3
Camborne-Redruth
22062
34920
1.58
4
Falmouth
1880
15040
8.00
5
Fowey MB
2979
2255
0.76
6
Helston MB
4014
4939
1.23
7
Launceston MB
2182
4273
1.96
=IF(D4<1,"GROUP1",(IF(D4<5,"GROUP2", "GROUP3")))
Consider the statement in stages: =IF(D4<1,”GROUP1”, “ANOTHER GROUP”) allows us to use two groups. Either the density is less than one or it isn’t. If it is the formula will return “GROUP1” otherwise it will return “ANOTHER GROUP”. So rather than ask Excel to return the string “ANOTHER GROUP” for all those districts which have a density of 1 or greater, we can nest a further IF statement. On its own this nested statement would look like this: =IF(D4<5, “GROUP2”, “GROUP3”). Here if the density is less than 5 Excel will return GROUP2 otherwise “GROUP3”. Putting the two together is like saying: if the contents of cell D4 is less than 1 return “GROUP1”, otherwise, first check to see whether its also less than 5, if it is then return “GROUP2”, and then for all others return “GROUP3”.
4. Counting within a list delimited by sequential numbers This series of formulae show how one can add the total number of people in a household to a spreadsheet based on census enumerators' books which only contains schedule numbers. The method can be employed for counting other lists of material which are delimited by sequential numbers. A B
C
D
2 1 MARY
GREAVES
HEAD
3
JAMES
DO
SON
4
JAMES
WARD
SERV
5
ANN
CHESHIRE
DO
6 2 ISAAC
PICKTON
HEAD
7
ELLEN
DO
WIFE
8
MARY ANN
DO
NEICE
9
CATHERINE
DO
DO
10
ELIZA
DO
DO
11 3 MARY
DUCKWORTH
HEAD
12
THOMAS
DO
SON
13
ELIZABETH
DO
DAUR IN LAW
14
MARY ANN
DO
GRANDDAUR
15
JOSEPH
ROYLE?
LODGER
16
THOMAS?
DO
DO
17
BETTY
LOWTON
DO
18
JOHN
FOULKES?
DO
19
JANE
DO
DO
20
FANNY
GARDY?
DO
21 4 JOSEPH
WATTS
HEAD
22
CATHARINE
DO
WIFE
23
FANY
DO
DAUR
24
CAROLINE
DO
DO
25
ELLEN
WILKINSON
SERV
26 5 LYDIA
FURNIVAL?
HEAD
27
MARY A
DO
NEICE
28
MARY G
DO
GRAND DAUR
29 6 WILLIAM
BEAUMONT
HEAD
30 7 ROBERT
DOBSON
HEAD
31
ELIZABETH
DO
WIFE
32
MARGARET
DO
DAUR
33
JANE
FLETCHER
STEP DAUR
E
F
G
H
1
Column A contains the schedule number; columns B and C contain the personal name and column D contains the relationship to household head. Note that the first row of this spreadsheet is empty.
Step one: Allocating a household number to each individual First, ensure that the first row of the table is empty. Type into cell E2 =IF(A2>=1,A2,E1) and then copy down to the end of the column. This checks whether the "schedule" number in column A is present or not. If it is present, then that number is put into column E; if it is not then the number directly above is inserted. Step two: Allocating a sequential household number In cell F2 insert the following formula: =IF(E2-E1=1,1,F1+1). This adds a sequential personal identifier for each member of the household. Step three: removing household numbers Insert =IF(F2>=F3,F2,0) into cell G2 and copy down to the end of the table. This allocates the total number of people in the household to the last person only, all others get the value of 0. Step four: putting the highest household number to all records in the household Putting the formula =IF(F2=G2,G2,H3) into cell H2. This adds the total number of people in each household to all the household members. Note: If using OpenOffice Calc replace the commas in the if statements with semicolons.