DATA ORGANIZATION
Data definition
“Data refers to a collection of organized information, usually the result of experience, observation or experiment, other information within a computer system, or a set of premises. This may consist of numbers, words, or images, particularly as measurements or observations of a set of variables.” (Wikipedia)
Outline
Need to organize data
Commonly used terms
Concepts of GIS data organization
How to organize data in ArcGIS?
Demo
Conclusion/Summary
NEED TO ORGANIZE DATA
Data must be stored in a form that allows both spatial and attribute data to be retrieved quickly for updating, querying, and analysis.
Organizing data for analysis
GIS software mainly organizes data in layers.
Layers are thematically defined, based on project requirements.
Attribute and spatial data may be entered at different times and may later need to be linked together.
Two kinds of data are involved: spatial data and non-spatial data.
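Linking attribute data to spatial data that was entered separately usually relies on a shared key. The following is a minimal sketch in plain Python, not ArcGIS code; the field name `parcel_id` and the sample records are hypothetical.

```python
# Hypothetical spatial layer: each feature has an ID and a point geometry.
features = [
    {"parcel_id": 101, "geometry": (2.0, 3.5)},
    {"parcel_id": 102, "geometry": (4.1, 1.2)},
    {"parcel_id": 103, "geometry": (7.3, 6.8)},
]

# Hypothetical attribute table, entered at a different time, keyed by the same ID.
attributes = [
    {"parcel_id": 101, "owner": "Smith", "land_use": "residential"},
    {"parcel_id": 103, "owner": "Lee", "land_use": "commercial"},
]


def join_attributes(features, attributes, key):
    """Attach attribute rows to features that share the same key value."""
    lookup = {row[key]: row for row in attributes}
    joined = []
    for feature in features:
        # Features without a matching attribute row keep only their own fields.
        row = lookup.get(feature[key], {})
        joined.append({**row, **feature})
    return joined


linked = join_attributes(features, attributes, "parcel_id")
for record in linked:
    print(record)
```

Parcel 102 has no attribute row yet, so it comes through with its geometry only. This mirrors the case where spatial data is entered before the attribute data.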
Commonly used terminology
Layer – “The visual representation of a geographic dataset in any digital environment ….” – ESRI
In ArcGIS, a layer refers to a data source, such as a shapefile, raster, or geodatabase feature class, together with any representation attached to it.
Table – A set of data stored in rows and columns. Columns are referred to as fields. Rows and columns intersect to form cells.
Relationship – An association or link that exists between two objects in a database. Relationships can exist between spatial objects (features), between nonspatial objects (rows in a table), or between spatial and nonspatial objects.
Feature – A geometric shape, often expressed as a vector, that usually represents a real-world object on a map.
Attribute Data – Tabular or textual data describing the geographic characteristics of features.
Spatial Reference – The coordinate system used to store a spatial dataset.
Feature Class – A collection of features with the same geometry type.
Raster Catalog – A collection of raster datasets. It is often used to display adjacent rasters.
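The terms feature, attribute data, spatial reference, and feature class can be illustrated with a small data structure. This is a sketch in plain Python under stated assumptions: the class names `Feature` and `FeatureClass`, the geometry type strings, and the sample road are all hypothetical and do not reflect how ArcGIS stores data internally.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    geometry_type: str              # "point", "line", or "polygon"
    coordinates: list               # list of (x, y) tuples
    attributes: dict = field(default_factory=dict)


@dataclass
class FeatureClass:
    name: str
    geometry_type: str              # every feature must share this type
    spatial_reference: str          # coordinate system label, e.g. "EPSG:4326"
    features: list = field(default_factory=list)

    def add(self, feature):
        """Add a feature, rejecting any whose geometry type differs."""
        if feature.geometry_type != self.geometry_type:
            raise ValueError(
                f"{self.name} holds {self.geometry_type} features, "
                f"got {feature.geometry_type}"
            )
        self.features.append(feature)


roads = FeatureClass("roads", "line", "EPSG:4326")
roads.add(Feature("line", [(0, 0), (1, 1)], {"name": "Main St"}))
```

Trying to add a point feature to `roads` raises a `ValueError`, which reflects the rule that a feature class holds a single geometry type.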
SPATIAL DATA LAYERS
Vertical Data Organization
Data is separated into themes, which are overlaid according to analytical requirements. This is the most common way to organize data.
SPATIAL DATA LAYERS contd…
Proper identification of layers is critical. This involves:
Identifying information products.
Identifying data requirements for information products.
Prioritizing data requirements and products.
Determining GIS functional requirements.
SPATIAL DATA LAYERS contd…
Spatial indexing – Horizontal Data Organization
The proprietary organization of data layers in a horizontal manner is called spatial indexing. It often takes the form of a grid of tiles, which makes querying and retrieval easier.
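A grid-based index of the kind described above can be sketched in a few lines. This is an illustrative plain-Python example, not the indexing scheme of any particular GIS; the cell size, the function names, and the sample wells are assumptions.

```python
from collections import defaultdict

CELL_SIZE = 10.0  # assumed tile width and height, in map units


def cell_of(x, y, cell_size=CELL_SIZE):
    """Return the (column, row) of the grid cell containing the point."""
    return (int(x // cell_size), int(y // cell_size))


def build_index(points, cell_size=CELL_SIZE):
    """Group point names by the grid cell they fall in."""
    index = defaultdict(list)
    for name, (x, y) in points.items():
        index[cell_of(x, y, cell_size)].append(name)
    return index


def query(index, points, xmin, ymin, xmax, ymax, cell_size=CELL_SIZE):
    """Find points inside a rectangle by visiting only the overlapping cells."""
    col0, row0 = cell_of(xmin, ymin, cell_size)
    col1, row1 = cell_of(xmax, ymax, cell_size)
    hits = []
    for col in range(col0, col1 + 1):
        for row in range(row0, row1 + 1):
            for name in index.get((col, row), []):
                x, y = points[name]
                # Cells only narrow the search; confirm with an exact test.
                if xmin <= x <= xmax and ymin <= y <= ymax:
                    hits.append(name)
    return sorted(hits)


# Hypothetical point layer.
points = {"well_a": (3.0, 4.0), "well_b": (12.5, 8.0), "well_c": (27.0, 31.0)}
index = build_index(points)
found = query(index, points, 0, 0, 15, 10)
print(found)
```

The query never looks at the cell that holds `well_c`, which is the point of the index: only tiles that overlap the search rectangle are read.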
SPATIAL DATA LAYERS contd…
The horizontal indexing of spatial data involves:
The use of a librarian subsystem to organize data for users.
The requirement for a formal definition of layers.
The need for feature coding within themes or layers.
Requirements to maintain data integrity through transaction control of selected tiles.
Organizing data in ArcGIS
Use ArcCatalog to preview and organize spatial and non-spatial data.
Use a geodatabase to act as a central repository.
A geodatabase is a container that stores a collection of datasets.
Types:
1. File geodatabases (.gdb)
2. Personal geodatabases (.mdb)
3. Enterprise (ArcSDE) geodatabases
COMPONENTS OF A GEODATABASE
Geographic Datasets
  Feature Dataset – vector based
  Raster Dataset – raster (pixel) based
  TIN Dataset – triangulation based
Tables
Feature Classes
  Simple feature classes
  Topological feature classes
Relationship Classes
INSIDE A GEODATABASE
Tables
Feature Dataset
  Spatial reference
  Feature classes: polygon, line, point, annotation, route
  Topology
  Geometric networks
  Network datasets
Relationship Classes
Raster Datasets
Raster Catalogs
Toolboxes
  Tools
  Scripts
  Models
Behavior
  Attribute defaults
  Attribute domains
  Split/merge policy
  Connectivity rules
  Relationship rules
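Two of the behaviors listed above, attribute defaults and attribute domains, can be illustrated as follows. This is a plain-Python sketch of the idea, not geodatabase code; the field names, default values, and permitted values are hypothetical.

```python
# Hypothetical behavior rules for a "roads" feature class.
DEFAULTS = {"surface": "paved", "lanes": 2}
DOMAINS = {
    "surface": {"paved", "gravel", "dirt"},  # coded-value domain
    "lanes": range(1, 9),                    # range domain: 1 through 8
}


def apply_behavior(row):
    """Fill in missing attributes with defaults, then check each domain."""
    completed = {**DEFAULTS, **row}
    for field_name, domain in DOMAINS.items():
        if completed[field_name] not in domain:
            raise ValueError(
                f"{field_name}={completed[field_name]!r} is outside its domain"
            )
    return completed


new_road = apply_behavior({"name": "Main St"})
print(new_road)
```

A row such as `{"name": "Back Rd", "surface": "sand"}` is rejected with a `ValueError`, which is the role a domain plays: it keeps invalid attribute values out of the database at entry time.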
CHOOSING BETWEEN DIFFERENT TYPES OF GEODATABASES
File Geodatabase
Stores datasets as files in a folder on disk
Size limit – 1 terabyte (can optionally be increased)
Operating system – cross-platform
Can be compressed to a read-only format
Good default choice

Personal Geodatabase
Stores its datasets in a Microsoft Access .mdb file on disk
Size limit – between 250 and 500 megabytes
Operating system – Windows only

ArcSDE Geodatabase
Stores datasets in a DBMS such as Oracle, Oracle with Locator or Spatial, SQL Server, DB2, Informix, or PostgreSQL
Allows multiuser editing, version control, and archiving
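The comparison above can be condensed into a simple decision rule. The sketch below is one possible reading of the slide, not official guidance; the function name, its parameters, and the choice of 500 MB as the cut-off for a personal geodatabase are assumptions taken from the figures listed above.

```python
def choose_geodatabase(size_mb, multiuser=False, versioning=False,
                       on_windows=False, needs_access=False):
    """Suggest a geodatabase type from the criteria listed in the comparison."""
    # Multiuser editing and version control are only listed for ArcSDE.
    if multiuser or versioning:
        return "ArcSDE geodatabase"
    # A personal geodatabase is Windows only and suited to small datasets;
    # here it is suggested only when Microsoft Access is explicitly needed.
    if needs_access and on_windows and size_mb <= 500:
        return "personal geodatabase (.mdb)"
    # The file geodatabase is described as the good default choice.
    return "file geodatabase (.gdb)"


print(choose_geodatabase(size_mb=2000))
print(choose_geodatabase(size_mb=100, on_windows=True, needs_access=True))
print(choose_geodatabase(size_mb=100, multiuser=True))
```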
CREATING A GEODATABASE
Migrate existing databases to a geodatabase.
Use the UML and CASE tools subsystem of ArcGIS to generate the schema of your geodatabase.
MODELING GEODATABASES
CASE tools implement UML.
Use design software such as Visio or Rational Rose to build the UML model.
Export the model to an intermediate format: Microsoft Repository or an XML Metadata Interchange (XMI) file.
Use the Schema Wizard in ArcCatalog to create the schema in your geodatabase from your UML model.
Use available templates to develop models.
IMPORTING DATA TO A GEODATABASE
Converting data to the geodatabase
Supported sources:
  Shapefiles
  Coverages
  Geodatabases
  CAD files (AutoCAD and MicroStation)
Data can be easily converted to geodatabase feature classes.
Multiple methods are available:
  Context menus in ArcCatalog and ArcMap
  Geoprocessing tools
  Copy/paste in ArcMap
  Drag and drop to ArcSDE
DATA IMPORT TOOLS
The Feature Class To Feature Class tool is the most versatile and accepts different data sources.
There are specific tools for annotation.
There are specific tools for CAD data.
The Load Data tool can be added to ArcMap via the Customize dialog.
DEMO
Create a geodatabase.
Create a feature dataset.
Define the spatial reference (coordinate system).
Import shapefiles into the dataset.
Import tables.
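The demo steps can be written down as an ordered plan of geoprocessing calls. The sketch below only builds that plan as plain Python tuples and does not call ArcGIS. The tool names are intended to match the ArcGIS geoprocessing tools Create File GDB, Create Feature Dataset, Feature Class To Feature Class, and Table To Table; the folder, file names, and coordinate system label are hypothetical, and the function `plan_demo` is an illustration, not part of any library.

```python
import posixpath


def plan_demo(folder, gdb_name, dataset_name, spatial_reference,
              shapefiles, tables):
    """Return the demo steps as (tool name, arguments...) tuples, in order."""
    gdb = posixpath.join(folder, gdb_name)
    dataset = posixpath.join(gdb, dataset_name)
    steps = [
        # 1. Create a geodatabase.
        ("CreateFileGDB", folder, gdb_name),
        # 2-3. Create a feature dataset and define its spatial reference.
        ("CreateFeatureDataset", gdb, dataset_name, spatial_reference),
    ]
    # 4. Import each shapefile into the feature dataset.
    for shp in shapefiles:
        out_name = posixpath.splitext(posixpath.basename(shp))[0]
        steps.append(("FeatureClassToFeatureClass", shp, dataset, out_name))
    # 5. Import each table into the geodatabase itself
    #    (tables sit at the geodatabase level, not inside a feature dataset).
    for table in tables:
        out_name = posixpath.splitext(posixpath.basename(table))[0]
        steps.append(("TableToTable", table, gdb, out_name))
    return steps


# Hypothetical inputs.
steps = plan_demo(
    "C:/gis", "project.gdb", "basemap", "WGS 1984",
    shapefiles=["C:/data/roads.shp"],
    tables=["C:/data/owners.dbf"],
)
for step in steps:
    print(step)
```

Keeping the plan as data makes it easy to review the order of operations before running anything. In an ArcGIS session, each tuple would correspond to one tool run with the listed arguments.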
Conclusion (steps taken to organize data)
Data organization is tailored to the research question you are trying to answer.
Identify the different pieces of the puzzle.
How are the different components of data tied together?
Have you documented the data and the process?