Introduction

In the Fall semester of 2023, as part of the “Introduction to Programming in Geographic Information” course, a comprehensive project was undertaken: the creation of a Stratified Random Sampling Tool using Python within ArcGIS Pro. This tool was designed to aid soil scientists in selecting optimal sampling locations for a hypothetical research project focused on studying vegetation growth potential in the Rocky Mountains and foothills of western Alberta.

Project Overview

The objective of the project was to assess the variability in vegetation growth potential based on elevation. The researchers adopted a stratified random sampling design and required a tool to generate random sampling locations within their study region. These locations needed to be stratified by elevation zones and restricted to relatively flat terrain.

Project Criteria

The tool had to meet the following input and output criteria set by the researchers:

Tool Inputs

The tool had the following inputs, set as parameters in the script tool dialog:

  1. A digital elevation model (DEM) for the study region, provided by the user as a raster dataset.
  2. A study area polygon, selected by the user as a shapefile.
  3. The number of unique random points to be generated in each elevation zone, ensuring equal sampling.
  4. The name and location for the new point shapefile that the tool would create.
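The four inputs above could be collected and checked before any processing starts. The following is a minimal sketch rather than the project's actual code: the names `ToolInputs` and `parse_inputs` are hypothetical, and the values are assumed to arrive as strings in dialog order (in a script tool they would typically be read with `arcpy.GetParameterAsText`).

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ToolInputs:
    """Hypothetical container for the four script tool parameters."""
    dem_path: str          # raster DEM for the study region
    study_area_path: str   # study area polygon shapefile
    points_per_zone: int   # same number of points requested in every zone
    output_path: str       # new point shapefile the tool will create


def parse_inputs(values):
    """Validate four parameter strings given in dialog order."""
    if len(values) != 4:
        raise ValueError(f"expected 4 parameters, got {len(values)}")
    dem, study_area, count_text, output = values

    try:
        count = int(count_text)
    except ValueError:
        raise ValueError(
            f"points per zone must be an integer: {count_text!r}"
        ) from None
    if count < 1:
        raise ValueError("points per zone must be at least 1")

    # The study area and the output are both described as shapefiles.
    for label, path in (("study area", study_area), ("output", output)):
        if Path(path).suffix.lower() != ".shp":
            raise ValueError(f"{label} must be a .shp file: {path!r}")

    return ToolInputs(dem, study_area, count, output)
```

Validating early means that a mistyped point count or a wrong file type is reported before any raster processing begins.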

Tool Outputs

The primary output was a shapefile of random point locations stratified by elevation zone and restricted by the slope criterion. Each point’s elevation zone was recorded in a dedicated field of the shapefile’s attribute table.

In addition to the main output, the tool reported messages to the user indicating whether it succeeded or failed and how many points were generated. When it failed, the message explained the likely cause, for example that the user had requested too many points.
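One way to produce such messages is to compare the requested number of points with the number of candidate locations available in each zone. The sketch below is an illustration under assumptions, not the tool's actual logic: `check_point_request` is a hypothetical helper, and the candidate counts per zone (for example, counts of flat raster cells) are assumed to have been computed already. In a script tool, the returned strings could be passed to `arcpy.AddMessage` or `arcpy.AddError`.

```python
def check_point_request(candidates_per_zone, points_per_zone):
    """Return (ok, messages) for a request of points_per_zone in every zone.

    candidates_per_zone maps a zone number to the number of candidate
    locations available in that zone (an assumed, precomputed input).
    """
    messages = []
    ok = True
    for zone in sorted(candidates_per_zone):
        available = candidates_per_zone[zone]
        if available < points_per_zone:
            ok = False
            messages.append(
                f"Zone {zone}: requested {points_per_zone} points but only "
                f"{available} candidate locations are available."
            )

    n_zones = len(candidates_per_zone)
    if ok:
        total = points_per_zone * n_zones
        messages.append(
            f"Success: {total} points can be generated "
            f"({points_per_zone} in each of {n_zones} zones)."
        )
    else:
        messages.append(
            "Failure: too many points requested; "
            "reduce the number of points per zone."
        )
    return ok, messages
```

For example, `check_point_request({1: 40, 2: 5}, 10)` reports a failure and names zone 2 as the zone that cannot hold the requested points.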

Development Phases of the Stratified Random Sampling Tool

Defining the problem:

In the initial Analysis phase, the requirements set out by the soil scientists were examined in detail. The problem was to support a study of vegetation growth potential in the Rocky Mountains and foothills of western Alberta. Elevation and slope were the two key variables, so the tool had to select sampling locations within specified elevation zones and on relatively flat terrain.

Designing the solution to the problem:

In the subsequent Design phase, an algorithm was formulated to implement the stratified random sampling design and meet the researchers’ criteria. It stratifies random points across distinct elevation zones, places the same user-defined number of points in each zone, and applies a slope restriction so that sampling sites fall on relatively flat terrain. The user interface was also planned at this stage by defining the tool’s parameters. Figure 1 shows the logical solution (algorithm/workflow) for selecting appropriate points.

Figure 1. Random Point Selection Algorithm

The process begins with the elevation and study area data and the key variables as inputs. After the datasets are projected and clipped, a slope surface is derived, and both slope and elevation are classified. The slope surface is reclassified into a binary layer that separates relatively flat terrain from the rest, while elevation is classified into zones. These layers are converted to polygons, which constrain where random sample points can be generated. Finally, the points are assigned their zone numbers and merged into a single output layer.
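The core logic of this workflow can be illustrated without ArcGIS on a small elevation grid. The sketch below is a simplified stand-in for the ArcPy workflow, not a reproduction of it: the DEM is a plain list of rows, slope is approximated with central differences, and the equal-interval zones, the slope threshold `max_slope`, and all function names are assumptions made for illustration.

```python
import math
import random


def slope_degrees(dem, cell_size):
    """Approximate slope (degrees) per interior cell by central differences."""
    rows, cols = len(dem), len(dem[0])
    slope = [[None] * cols for _ in range(rows)]  # edge cells stay None
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2 * cell_size)
            slope[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return slope


def stratified_sample(dem, cell_size, n_zones, max_slope,
                      points_per_zone, seed=None):
    """Pick points_per_zone unique flat cells at random in each zone."""
    rng = random.Random(seed)
    slope = slope_degrees(dem, cell_size)
    values = [z for row in dem for z in row]
    low, high = min(values), max(values)
    width = (high - low) / n_zones or 1.0  # guard against a flat DEM

    # Group flat cells by equal-interval elevation zone (1..n_zones).
    candidates = {zone: [] for zone in range(1, n_zones + 1)}
    for r, row in enumerate(dem):
        for c, z in enumerate(row):
            s = slope[r][c]
            if s is not None and s <= max_slope:
                zone = min(int((z - low) / width) + 1, n_zones)
                candidates[zone].append((r, c))

    # Draw the same number of unique cells from every zone.
    samples = []
    for zone, cells in candidates.items():
        if len(cells) < points_per_zone:
            raise ValueError(
                f"zone {zone} has only {len(cells)} flat cells, "
                f"but {points_per_zone} points were requested"
            )
        samples += [(r, c, zone) for r, c in rng.sample(cells, points_per_zone)]
    return samples
```

Each returned tuple holds a row, a column, and the zone number, which mirrors the zone attribute carried by the output points. Drawing without replacement within each zone is what keeps the points unique and the sampling equal across zones.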

Translating the algorithm into a programming language:

In the Code phase, the algorithm was translated into Python using the ArcPy package in ArcGIS Pro. The code was then packaged as an ArcGIS script tool so that users can run it from within the familiar ArcGIS Pro environment as part of their GIS workflow. The code was written in Visual Studio Code (VSCode), which flagged syntax errors during editing and made development more efficient.

Figure 2. Coding Stage in VSCode

Testing and debugging:

In the Debug and Test phase, the tool was tested with a range of datasets representative of the study region to identify and correct errors and to confirm that its results were reliable. Edge cases received particular attention to ensure the tool remained stable. Messages were added to report success or failure and to explain failures, such as a request for too many points.

Documentation:

Finally, in the Complete Documentation phase, a user guide and comprehensive code documentation were created. The user guide elucidated the tool’s functionality, explaining each input parameter’s significance. The code documentation included comments providing insights into complex sections, fostering understanding for future users or developers. This documentation served to empower users and facilitate the tool’s integration into GIS workflows.

Creating a GUI for the Stratified Random Sampling Tool in ArcGIS Pro

To improve the user experience, a Graphical User Interface (GUI) was created for the Stratified Random Sampling Tool within ArcGIS Pro. The GUI, implemented as a script tool, lets users enter the input parameters, visualize the workflow, and run the tool from a single dialog. This makes the tool more efficient to use and more accessible to researchers and GIS professionals alike.

Figure 3. Intuitive GUI for the Stratified Random Sampling Tool in ArcGIS Pro

Visualizing Stratified Sampling Results

Figure 4 shows the output shapefile of random point locations, stratified by elevation and constrained by the slope criterion. The result indicates that the tool met the project’s criteria.

Figure 4. Tool Output

Conclusion

The project demonstrated a practical application of Python programming within ArcGIS Pro, showing how programming skills and GIS expertise combine to address a specific spatial analysis need. It also illustrates how scripting is applied in real-world geographic information projects.