Point Pattern Analysis In An Arcgis Environment

  • November 2019
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Point Pattern Analysis In An Arcgis Environment as PDF for free.

More details

  • Words: 3,426
  • Pages: 17
Point Pattern Analysis in an ArcGIS Environment

Jared Aldstadt and Arthur Getis San Diego State University

For a number of years in the Departments of Geography at San Diego State University and the University of Illinois, we have been working on the development of software for a variety of ESDA routines, especially in the field of spatial association. These software programs make it possible to easily apply such analytical devices as nearest neighbor, general pattern, and local statistics. We have called our routine Point Pattern Analysis (PPA), but the actual product is constantly in flux as we attempt to make the approach as user friendly and comprehensive as we can. Those who have worked on the programs over the years include Marc Armstrong (Iowa), Laura Hungerford (Illinois), DongMei Chen (Queens), Lauren Scott (ESRI), and the authors. Currently we are adapting some of the materials of PPA for use in an ArcGIS environment. We use Visual Basic for Applications (VBA) to create a visual representation of several PPA routines. We are now in a position to demonstrate the use of certain local statistics in a fully functioning, uncoupled system. In this paper we demonstrate our approach on the local G statistic (Ord and Getis 1995). We should point out that one of our recent endeavors is to create a user-friendly approach to the new O statistic (Ord and Getis 2001), a spatial statistic designed to describe local clustering in a non-stationary setting. To better understand what it is that we are attempting to do, we call attention to our on-line software materials, which represent an earlier approach to 14 different ESDA tools with documentation. It can be seen at: http://xerxes.sph.umich.edu:2000/cgi-bin/cgi-tcl-examples/generic/ppa/ppa.cgi. Building on that work, Lauren Scott recently included in an ArcView 3.2 environment several of these routines. We used these successfully in exercises assigned for social science students and young faculty in

CSISS workshops on pattern analysis the last two summers. Now, we are moving further along by translating these routines into the ArcGIS environment. Thus far PPA has been used in about a dozen published research projects. In addition, SpaceStat, the well-known spatial econometrics package, and SAGE, a British ESDA package, have a few of the PPA statistics included. Just recently, ClusterSeer, from BioMedware, the health-related software company, has adopted our approach for certain cluster analyses.

ArcGIS ArcGIS is the latest version of Environmental Research Systems Institute's (ESRI) suite of GIS products. ArcGIS is designed as a scalable system for geographic data creation, management, and analysis. ESRI products have a large user base. The ESRI website states that there are over 1 million ESRI software users worldwide, and that 50,000 university students receive instruction utilizing ESRI products every year (ESRI 2002). In previous versions, ESRI's desktop GIS, ArcView, and its enterprise level GIS, ArcInfo, were very different in terms of the primary geographic data model and user interface. In ArcGIS, however, all of the products use the same data model, GUI interface, and development environment (Limp 2001). In addition to ArcView and ArcInfo, there is a medium sized version of ArcGIS called ArcEditor. Each variety of ArcGIS consists of several individual applications that provide a set of functionality. ArcMap is a primary component of all three versions and is the interface for data display and analysis. An analysis tool developed for ArcMap can be used in all versions of ArcGIS.

ArcObjects and VBA The development platform for the ArcGIS applications is called ArcObjects. ArcObjects is built using Microsoft's Component Object Model (COM) and can be embedded and extended using any COM compliant language. Visual Basic and Visual C++ are COM compliant languages that can be used to extend existing ArcGIS applications or write stand-alone applications that utilize ArcObjects components. This method of development generally requires that the developer first becomes a competent Visual Basic or Visual C++ developer in order to exploit ArcObjects.

An easier way to extend ArcGIS is to use VBA, which is embedded in the ArcMap application. VBA is a scripting language that allows the developer to use the existing framework of ArcMap and add functionality through the creation of custom tools, menus, and modules (Zeller 2001). The primary advantage of using VBA to develop spatial analysis tools for ArcMap is ease of development. The integrated development environment (IDE) allows the programmer to design, code, and debug within a common environment. Building GUI interfaces is simplified with a drag and drop control toolbox. Easy integration with common Windows features such as context help, Windows help files, and file access dialogs allow custom tools to have the same look and feel as larger applications. As with all scripting languages, the primary drawback of VBA is slow execution compared to compiled code. This may be especially important in developing computationally intensive statistical analysis tools.

PPA version 1.0 The package is written in the C programming language and includes a variety of pattern analysis routines. There are both DOS and UNIX versions of the program. A screenshot of the DOS version of the program running in Windows is shown in Figure 1.

Figure 1: A screenshot of the PPA menu. There are several limitations that the PPA users face when analyzing their data. PPA accepts input as an ASCII file in three columns: x coordinate, y coordinate, and value at the point. The output is saved to

an ASCII file and displayed on the screen as text. Often, PPA users must convert their data files to ASCII before running PPA and again convert their files to a GIS compatible format for display after analysis. The interface of PPA is text menu based. The lack of a GUI interface and features such as file dialogs create frustration among users used to windowing environments. Integrating PPA routines into a GIS environment would eliminate these drawbacks and allow for truly integrated exploratory spatial data analysis (ESDA).

The following routines are available in PPA version 1.0: •

Nearest Neighbor Analysis – A test of significance on the observed mean nearest neighbor distance in a point pattern



Refined Nearest Neighbor Analysis – A Monte Carlo test comparing the complete distribution function of the observed nearest neighbor distances.



K-function Analysis – Also called second order analysis, a point pattern test based on inter-event distances and a Monte Carlo significance test.



Local K-function – A local test based on K-function analysis that examines the clustering of neighbors around each point individually.



Weighted K-function analysis – A statistical test based on the permutation of the weights at each point.



Knox space time clustering – A method to analyze the clustering of events in both space and time based on the Poisson distribution.



Join count statistic – A simple measure of spatial autocorrelation for data measured at the nominal scale.



Moran’s I – A measure of global spatial autocorrelation produced by standardizing the spatial autocovariance by the variance of the data.



Geary’s c - A measure of global spatial autocorrelation that uses the sum of the squared differences between pairs of data values as its measure of covariation.



Getis and Ord’s General G - A multiplicative measure of overall spatial association of values that fall within a given distance of each other.



Local Moran’s I – Local Autocorrelation statistic based on Moran’s I.



Local Getis and Ord G-statistics – These statistics evaluate the extent to which a location is surrounded by a cluster of high or low values.

PPA in ArcGIS We have selected the local Getis and Ord G-statistics as a test case for implementing PPA routines in ArcGIS. This custom tool is written as a VBA macro. The macro uses one form to receive user input, and an additional dialog box to query the user for an output field name. After computing the statistic for each observation, the result is saved in the specified field of the original dataset. The complete code listing of the local G macro is included in Appendix A, and an example of its use follows. The example will use the local G-statistic to analyze the cancer rates of New Mexico counties. The data are taken from the National Cancer Institute Biometry Research Group Datasets, and it is available on their website, http://dcp.nci.gov/bb/datasets.html. The total number of cases diagnosed between 1980 and 1989 were divided by the 1985 population estimate to determine the cancer rate (cases per 10,000 population) of each county. County seats are used as a proxy for population centroids. The data are in the shapefile format and are loaded into ArcMap. For reference, a polygon layer of the New Mexico counties is also displayed.

Figure 2: Running a macro in ArcMap

In order to run a macro, the user goes to the Tools menu and selects the Macros option. A popup menu will be displayed with two more options. Selecting the Visual Basic Editor option will bring up the VBA editor and allow the user to create, view, and modify macros. Selecting the Macros… option will display the macros available to the user in the current ArcMap document. The Macros dialog is shown in Figure 3.

Figure 3: The Macros Dialog In the Macros dialog box, the user selects the desired macro and clicks an action button along the right hand side. In this case the only macro available is the LocalG.LocalG macro. The name before the period indicates the code module that the macro is in, and the name after the period is the name of the subroutine to run. When the Run button is clicked the macro begins execution. The Local G Statistic dialog is displayed to prompt the user for input. The first combo box prompts the user to select a layer to analyze. This combo box will be populated with all the point layers in the active map. After selecting a point layer, the second combo box is populated with all the numeric fields available for the specified layer. In our example, the user selects the NM County Seats point layer and the CancerRate field. The text box labeled Distance is used to enter a distance for analysis. The distance units are the same as the units of the map, and in this case the units are meters. The checkbox labeled Star specifies

whether to include the value at the observation itself in the analysis at that point. For the example we have chosen a distance of 100,000 meters and we have checked the Star checkbox. After specifying the parameters of analysis the user presses the OK button to continue. At anytime the user may select the Cancel button to exit the macro.

Figure 4: The Local G Statistic parameter input dialog Next, the user is prompted for an output field name (Figure 5). A default name is supplied, but the user may change the name. For display purposes we have changed the field name to Gi*(100km) from the generated default field name, Gi*(100000).

Figure 5: Output Field dialog box After the user selects an output field name, the specified G-statistic is calculated for each observation and stored in the attribute table of the layer being analyzed. The attribute table for our example

dataset is shown in Figure 6. The existing data and our newly created field, the column on the right hand side of the table labeled Gi*(100km), can be viewed together immediately after execution of the macro.

Figure 6: Attribute Table for NM County Seats dataset The existing functionality of ArcMap can then be used to display the results graphically using charts, graphs, and maps. These graphical outputs can be dynamically linked to the data table to further aid ESDA. The map in Figure 7 shows the local G-statistic values calculated above. Figure 8 is a more

complex map that displays both the cancer rates and the G-statistic value at five different distances for each county.

Figure 7: The local Gi* statistic value for 100km calculated from cancer rates of New Mexico counties

Looking Ahead The use of VBA macros to create spatial data analysis tools in ArcMap is relatively easy to learn and quick to implement. Leveraging the existing file manipulation, data interfaces, and visualization functionality of ArcMap allows the programmer to concentrate on developing spatial analysis methods. There are many existing features of ArcObjects that we have not yet implemented. The integration of help files and the utilization of the graphing objects are planned additions. Still unexplored is the ability of ArcObjects to create new layers and graphics to aid in the visualization of data and results. We hope to show distance qualities in analysis by stages. This will provide an outstanding tool for those wanting to learn about the relevant spatial statistics.

In the case of computationally intensive methods, however, the slow execution of VBA will not suffice. To perform analysis that use Monte Carlo simulations, such as K-function analysis, or involve a large numbers of computations, like the O statistic, requires compiled software components. These components will demand a more intensive development effort. Although ESRI has a large user base, the expense of purchasing a distribution of ArcGIS makes these tools inaccessible to many institutions, companies, and individuals. It is important, therefore, to maintain existing free versions of PPA and work with other vendors and developers of open source spatial analysis tools.

Figure 8: Map of New Mexico Cancer Rates by County and Gi* values for five distances

References ESRI (2001). ArcObjects Developer Help, Environmental Research Systems Institute, Inc., Redlands, California ESRI (2002). Environmental Research Systems Institute, Inc., http://www.esri.com/ Limp, W. Frederick, (2001). Quick Take Reviews: ArcGIS 8.1, GeoWorld, National Cancer Institute Biometry Research Group Datasets http://dcp.nci.gov/bb/datasets.html Ord, J.K. and Getis, A., (1995). Local Spatial Autocorrelation Statistics: Distribution Issues and an Application, Geographical Analysis, 27(4): 286-306 Ord, J.K. and Getis, A., (2001). Testing for Local Spatial Autocorrelation in the Presence of Global Autocorrelation, Journal of Regional Science, 41(3): 411-432 Getis, Arthur, Chen, Dong Mei, and Aldstadt, Jared (2000). Point Pattern Analysis 1.0, http://xerxes.sph.umich.edu:2000/cgi-bin/cgi-tcl-examples/generic/ppa/ppa.cgi Zeller, Michael, (2001). Exploring ArcObjects, Environmental Research Systems Institute, Inc., Redlands, California

Appendix A: VBA Code for Local G Macro This appendix is a listing of the code for the macro module that calculates the local G statistic. There are global variable declarations, two module subroutines (LocalG and getFields), and three form event subroutines. Global Variable Declarations Public Public Public Public

star As Boolean gDist As Double valueField As String layerName As String

LocalG Subroutine Sub LocalG() 'This macro computes the local G statistics for point feature classes 'in ArcMap 'Set document and Map objects Dim pMxDoc As IMxDocument, pEnumFeat As IEnumFeature Dim pGeom As IGeometry Dim pMap As IMap Set pMxDoc = ThisDocument Set pMap = pMxDoc.FocusMap Dim pFeatureLayer As IFeatureLayer Dim pILayer As ILayer Dim pFeatureClass As IFeatureClass 'Set a UID for GeoFeatureLayers Dim pId As New UID pId = "{E156D7E5-22AF-11D3-9F99-00C04F6BC78E}" 'IGeoFeatureLayer 'Create an enumerator of GeoFeatureLayers Dim pEnumLayer As IEnumLayer Set pEnumLayer = pMap.Layers(pId) 'Load the input form Load frmLocalG 'Populate the layer combo box with point layers pEnumLayer.Reset Set pILayer = pEnumLayer.Next Dim pointExists As Boolean pointExists = False Do While Not pILayer Is Nothing Set pFeatureLayer = pILayer Set pFeatureClass = pFeatureLayer.FeatureClass If pFeatureClass.ShapeType = esriGeometryPoint Then frmLocalG.cmbLayer.AddItem pILayer.Name pointExists = True End If Set pILayer = pEnumLayer.Next Loop 'If there are no point layers in the map, send a message and exit If pointExists = False Then

MsgBox "This routine requires a point layer" Exit Sub End If 'Show the form frmLocalG.Show 'If cancel button is selected, exit subroutine If layerName = "" Then Exit Sub End If 'Set the layer to be analyzed pEnumLayer.Reset Set pILayer = pEnumLayer.Next Do While Not pILayer Is Nothing If pILayer.Name = layerName Then Exit Do End If Set pILayer = pEnumLayer.Next Loop Set pFeatureLayer = pILayer Set pFeatureClass = pFeatureLayer.FeatureClass 'Get the value field index Dim fieldIndex As Integer fieldIndex = pFeatureClass.FindField(valueField) 'find or create field for output 'generate the default name string Dim outFieldName As String Dim defaultName As String If star Then defaultName = "Gi*(" + Str(gDist) + ")" Else defaultName = "Gi(" + Str(gDist) + ")" End If 'Give default name in InputBox for output field name outFieldName = InputBox("Please enter an output field name.", _ "Output Field", defaultName) 'look for existing field Dim outFieldIndex As Integer outFieldIndex = pFeatureClass.FindField(Left(outFieldName, 10)) 'if field does not exist, create it If outFieldIndex < 0 Then Dim pOutField As IField Dim pOutFieldEdit As IFieldEdit Dim pFields As IFields Dim pFieldsEdit As IFieldsEdit Set pOutField = New Field Set pOutFieldEdit = pOutField Set pFields = pFeatureClass.Fields Set pFieldsEdit = pFields With pOutFieldEdit .Editable = True .IsNullable = False

.Name = Left(outFieldName, 10) .AliasName = outFieldName .Type = esriFieldTypeDouble .Length = 16 .Precision = 15 .Scale = 4 End With pFeatureClass.AddField pOutField Set pOutField = Nothing outFieldIndex = pFeatureClass.FindField(Left(outFieldName, 10)) End If 'create feature cursor to edit the table Dim pFeatureBuffer As IFeatureBuffer Dim pFeatureCursor As IFeatureCursor Set pFeatureCursor = pFeatureClass.Search(Nothing, False) Set pFeatureBuffer = pFeatureClass.CreateFeatureBuffer Dim icount As Integer icount = pFeatureClass.FeatureCount(Nothing) ReDim dataset(icount, 4) As Double 'array used for calculations Dim counter As Integer Dim sumval As Double, sumsqr As Double sumval = 0 sumsqr = 0 Dim pFeature As IFeature Dim nvalue As Double 'read data into array, get coordinates For counter = 0 To (icount - 1) Set pFeature = pFeatureClass.GetFeature(counter) nvalue = pFeature.Value(fieldIndex) Set pGeometry = pFeature.Shape Dim pPoint As IPoint Set pPoint = pFeature.Shape dataset(counter, 0) = pPoint.x dataset(counter, 1) = pPoint.Y dataset(counter, 2) = nvalue sumval = sumval + nvalue sumsqr = sumsqr + (nvalue * nvalue) Next 'Calculate statistic and store values in the table D = gDist If star Then For counter = 0 To (icount - 1) Set pFeature = pFeatureClass.GetFeature(counter) Dim s As Double, xbar As Double, sumw As Double, sumwx As Double xbar = sumval / icount s = Sqr((sumsqr / icount) - (xbar ^ 2)) sumw = 0 sumwx = 0 For jcounter = 0 To (icount - 1) Distance = Sqr(((dataset(counter, 0) - _

dataset(jcounter, 0)) ^ 2) + _ (((dataset(counter, 1) - _ dataset(jcounter, 1)) ^ 2))) If Distance <= D Then sumwx = sumwx + dataset(jcounter, 2) sumw = sumw + 1 End If Next dataset(counter, 3) = (sumwx - (xbar * sumw)) / _ (s * Sqr(((icount * sumw) - _ (sumw ^ 2)) / (icount - 1))) pFeature.Value(outFieldIndex) = Round(dataset(counter, 3), 4) pFeature.Store Next Else For counter = 0 To (icount - 1) Set pFeature = pFeatureClass.GetFeature(counter) Dim sn As Double, xbarn As Double Dim sumwn As Double, sumwxn As Double xbarn = (sumval - dataset(counter, 2)) / (icount - 1) sn = Sqr(((sumsqr - (dataset(counter, 2) ^ 2)) / _ (icount - 1)) - (xbarn ^ 2)) sumwn = 0 sumwxn = 0 For jcounter = 0 To (icount - 1) If jcounter <> counter Then Distance = Sqr(((dataset(counter, 0) - _ dataset(jcounter, 0)) ^ 2) + _ (((dataset(counter, 1) - _ dataset(jcounter, 1)) ^ 2))) If Distance <= D Then sumwxn = sumwxn + dataset(jcounter, 2) sumwn = sumwn + 1 End If End If Next If sumwxn = 0 Then dataset(counter, 3) = Else dataset(counter, 3) = / End If

0 (sumwxn - (xbarn * sumwn)) _ (sn * Sqr(((icount * sumwn) _ (sumwn ^ 2)) / (icount - 1)))

pFeature.Value(outFieldIndex) = Round(dataset(counter, 3), 4) pFeature.Store Next End If 'Star End Sub

getFields Subroutine Public Sub getFields(layerName As String) 'This subroutine populates the fields combo box in the frmLocalG when 'a point layer is selected. Dim pMxDoc As IMxDocument, pEnumFeat As IEnumFeature Dim pGeom As IGeometry Dim pMap As IMap Set pMxDoc = ThisDocument Set pMap = pMxDoc.FocusMap Dim Dim Dim Dim Dim

pFeatureLayer As IFeatureLayer pILayer As ILayer pFeatureClass As IFeatureClass pFields As IFields pfield As IField

Dim pEnumLayer As IEnumLayer Set pEnumLayer = pMap.Layers() pEnumLayer.Reset Set pILayer = pEnumLayer.Next Do While Not pILayer Is Nothing If pILayer.Name = layerName Then Exit Do End If Set pILayer = pEnumLayer.Next Loop Set pFeatureLayer = pILayer Set pFeatureClass = pFeatureLayer.FeatureClass Set pFields = pFeatureClass.Fields For i = 0 To (pFields.FieldCount - 1) Set pfield = pFields.Field(i) If pfield.Type = esriFieldTypeDouble Or esriFieldTypeInteger Or _ esriFieldTypeSingle Or _ esriFieldTypeSmallInteger Then frmLocalG.cmbField.AddItem pfield.Name End If Next i End Sub

Form Event Subroutines Private Sub btnCancel_Click() layerName = "" Unload frmLocalG End Sub Private Sub btnOK_Click() If CDbl(txtDistance.Value) = 0 And chkStar.Value = False Then MsgBox "Gi cannot be performed for distance = 0." Exit Sub End If layerName = cmbLayer.Value valueField = cmbField.Value gDist = CDbl(txtDistance.Value) star = chkStar.Value Unload frmLocalG End Sub Private Sub cmbLayer_Change() getFields (cmbLayer.Value) cmbField.Enabled = True End Sub

Related Documents