Lab 3: Finding correlations
Housekeeping:
Make sure StarOffice is installed on your account. If it is not:
- At the UNIX prompt, type: cd /auto/staroffice/program
- At the UNIX prompt, type: setup
- Accept all defaults
Verify that you have turned on the Spatial Analyst and GeoProcessing
extensions:
- File --> Extensions (checked extensions are activated)
Review from last week:
In the last lab we got to this point:
We have created a polygon coverage indicating the shoreline of the Jupiter
Inlet. To utilize this in removing the problem areas from the Loxlattice
data we only need the central polygon. Save this polygon as a separate
file by making the Shoreline coverage active, then:
- Select the central polygon with the selection tool (the selected
polygon will appear yellow)
- Theme --> Convert to Shapefile
- Name the file (I used Mask.shp) and make sure it is saved in
the correct directory
Next we need to convert Loxlattice, which is a continuous raster format,
to discrete depth categories.
Highlight Loxlattice in the table of contents, then choose Analysis --> Reclassify.
The Reclassify Values window should appear.
- Click on the Classify button (upper left)
In the window that appears, change the number of classes to 8 and hit OK,
then hit OK in the Reclassify Values window.
Reclass of Loxlattice should appear in the table of contents.
When you turn it on, it should look something like this:
To make this into a theme we can work with, it needs to be a shapefile:
- Theme --> Convert to Shapefile
- Name the theme (I used Depth.shp), making sure the file
is saved in the correct directory
It should now look like this:
Now we will clip out all the land area so we will only have the water
area to use in calculations. Earlier we created a theme called Mask.shp
which shows only the water area. This is what we will use to clip
Depth.shp, using the Geoprocessing Wizard.
- View --> GeoProcessing Wizard
- Select "Clip one theme based on another", hit Next
- Make the input theme Depth.shp
- Make the polygon overlay theme Mask.shp
- Make sure the output file is saved to the correct
directory and name the output file (I used Depthclip.shp)
Hit Finish. It will take a few minutes to create the new shapefile.
We now have a shapefile of water depths that we can use
for calculations.
Calculating a Chi-square statistic: Finding correlation between
two polygon themes
We want to know if there is a correlation between the depth of the water
and the soil type found at the bottom of the inlet. To determine
this we will use a Chi-square statistic. We will use a coverage showing
soil types called bottomtype.
Add in bottomtype theme:
- View --> Add Theme (or click on the Add Theme button on the tool bar)
The theme will be found at: /home/sanduku/classes/5930/bottomtype
Turn off all the themes except Depthclip.shp and Bottomtype
Clip Bottomtype using Depthclip.shp:
- View --> GeoProcessing Wizard
- Select "Clip one theme based on another", hit Next
- Make the input theme Bottomtype
- Make the polygon overlay theme Depthclip.shp
- Make sure the output file is saved to the correct directory and name
the output file (I used Clipbottom.shp), hit Finish
There are two ways we can compare the depth information with soil types:
visually and numerically. To compare visually, double click on Depthclip.shp
in the table of contents to bring up the Legend Editor. Change the
Legend Type to Graduated Color, change the Classification Field to Gridcode,
and hit Apply. Next, double click on Clipbottom.shp, make the Legend
Type Unique Value, and make the Values Field Bottom. Make the symbols
appear as patterns, rather than solid colors, by clicking on the pattern
icon [icon]. To make the patterns colorful, use the color icon [icon].
Hit Apply.
Your view should now look something like this:
What does this comparison show you?
Visually it is difficult to determine if there is any correlation between
the depth information and soil types. So, we will numerically calculate
a Chi-square statistic to determine the correlation.
Create a crosstabulation:
- Analysis --> Tabulate Areas
- Row Theme = Depthclip.shp
- Row Field = Gridcode
- Column Theme = Clipbottom.shp
- Column Field = Bottom
Hit OK, accept defaults
Your result is a table that looks like this:
What does this table tell you?
To be able to work with the table, we need to be able to open it in
StarOffice. With the crosstabulation table open:
Save it as a dBASE file, in your working directory, as crosstab1.dbf
Open StarOffice
- File --> Open (browse to the directory where you saved
the file)
You now have a table that looks like this:
These values show the area in square feet. Because these values
are so large, it will be easier to work with area in acres.
1 acre = 43560 square feet
In cell B14 enter the formula =B2/43560 and hit Enter.
Highlight cell B14, then Edit --> Copy.
Highlight the cells from B14 to F21 (click in B14, hold down the left mouse
button and drag to F21); highlighted cells should appear black.
Then Edit --> Paste to fill the range with the formula.
You should now have the values for area in acres. Next we need row
and column totals.
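The unit conversion above can also be sketched outside the spreadsheet. This is a minimal illustration in Python, not part of the lab procedure; the function name to_acres and the sample areas are made up for the example, and only the factor 43560 comes from the text.

```python
SQ_FT_PER_ACRE = 43560  # 1 acre = 43560 square feet (from the lab text)


def to_acres(table_sq_ft):
    """Divide every cell of a 2-D table of square-foot areas by 43560."""
    return [[cell / SQ_FT_PER_ACRE for cell in row] for row in table_sq_ft]


# Hypothetical areas in square feet: 2 depth classes (rows) x 2 soil types (columns)
sample_sq_ft = [[87120.0, 43560.0], [21780.0, 130680.0]]
sample_acres = to_acres(sample_sq_ft)
print(sample_acres)
```

This does the same job as copying =B2/43560 across B14 to F21: each cell is converted independently.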
Highlight cells B22 to F22 and hit the sum button on the tool bar.
The column totals should be calculated.
Highlight cells G14 to G22 and hit the sum button. The row totals
should now be calculated.
Your new table should look like this:
The value in the lower right-hand corner is the total area, in acres,
that you are working with.
This table shows you the observed values of soil type area per depth
class. (Remember that your soil types are the columns and your depth
classes are the rows.)
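As a cross-check on the sum button, the row totals, column totals, and grand total of an observed table can be sketched in Python. The helper name add_totals and the small table are hypothetical; the real table has 8 depth classes and 5 soil types.

```python
def add_totals(table):
    """Return (row_totals, col_totals, grand_total) for a 2-D table."""
    row_totals = [sum(row) for row in table]
    # zip(*table) regroups the cells column by column
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return row_totals, col_totals, grand_total


# Hypothetical observed areas in acres: rows are depth classes, columns are soil types
observed = [[2.0, 1.0], [0.5, 3.0]]
row_totals, col_totals, grand_total = add_totals(observed)
print(row_totals, col_totals, grand_total)
```

The grand total should be the same whether you add up the row totals or the column totals, which is a quick way to catch a mis-selected range.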
For a Chi-square calculation, we need to know expected values as well
as observed values. We will calculate the expected values using
the row and column totals.
Highlight the entire acre table, B14 to G22, then Edit --> Copy.
Move to cell B25, then Edit --> Paste Special.
Under Selection, uncheck Paste all, check Numbers, and uncheck everything else.
Hit OK.
This way you are pasting the numbers rather than the formulas.
Highlight cells B25 to F32 (everything except row and column totals)
and delete their contents.
Go to cell B25 (which should now be empty) and enter this formula:
=$G25*B$33/$G$33
This multiplies the row total by the column total and divides by the
grand total to give the expected value for that cell.
Copy this formula to fill the cells B25 to F32. (Highlight cell B25,
Edit --> Copy, highlight cells B25 to F32, Edit --> Paste)
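The expected-value formula (row total times column total, divided by the grand total) can be sketched in Python as follows. The function name expected_values and the sample numbers are assumptions for illustration only.

```python
def expected_values(observed):
    """Expected cell value = row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical observed areas in acres: rows are depth classes, columns are soil types
observed = [[10.0, 20.0], [30.0, 40.0]]
expected = expected_values(observed)
print(expected)
```

Here the first row total is 30, the first column total is 40, and the grand total is 100, so the first expected value is 30 * 40 / 100 = 12. The expected table keeps the same row and column totals as the observed table.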
You will remember from class that the Chi-square statistic is calculated
by summing this quantity over every cell of the table:
((Observed - Expected)^2)/Expected
Go to cell B36 and enter this formula: =((B14-B25)^2)/B25
Highlight cell B36, then Edit --> Copy.
Highlight cells B36 to F43, then Edit --> Paste.
Sum the rows and columns, as you did for the acre table.
The number in the lower right-hand corner is your Chi-square statistic.
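To check the spreadsheet result, the whole calculation can be sketched in Python. This is an illustration under assumptions, not the lab's required method: the function names and sample numbers are hypothetical, and the degrees-of-freedom rule (rows - 1) * (columns - 1) is the standard one for a contingency table, which you should confirm against your text.

```python
def chi_square(observed):
    """Sum of (observed - expected)^2 / expected over every cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - exp) ** 2 / exp
    return statistic


def degrees_of_freedom(observed):
    """(number of rows - 1) * (number of columns - 1)."""
    return (len(observed) - 1) * (len(observed[0]) - 1)


# Hypothetical observed areas in acres: rows are depth classes, columns are soil types
observed = [[10.0, 20.0], [30.0, 40.0]]
statistic = chi_square(observed)
dof = degrees_of_freedom(observed)
print(round(statistic, 4), dof)
```

For the lab's table of 8 depth classes by 5 soil types, this rule would give (8 - 1) * (5 - 1) = 28 degrees of freedom when looking up a critical value.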
Using the information you learned in class, or using your textbook,
analyze the Chi-square statistic you calculated. An explanation of
Chi-square begins on page 171 in your text.
Is there a correlation between soil type and water depth?