Being several months into my time as an ECCE associate coincides with being several months into a Master’s thesis. In my particular case, my use of Esri software has grown geometrically. From tandem use of ArcGIS Pro 2.x and ArcCatalog 10.x (I still feel more comfortable with the dedicated app for spatial-file management), to Portal integrations, to Story Maps, I have engaged with the newest lineup of Esri’s GIS products.
When it comes to my particular flavor of ‘practical application’, my research uses collected GPS tracks to examine the relationships between SLOTH – Sleep, Leisure, Occupation, Transport, Home – locations and the objective activity spaces subjects interact with for health outcomes. As an unabashed logical positivist, I find the objective nature of GPS data a godsend. For matching interaction with space, GPS is considered highly objective, which reduces subjective biases such as recall bias, daily mobility bias and egoistic bias; let’s ignore researcher bias for now.
GPS units provide a bevy of useful data, not least of which are the various accuracy/precision factors recorded for the signals from the heavenly fleet of satellites. Once collected from eager research participants, the GPS data is typically provided in CSV format. Therein lies the trick: such data has to be converted to proper, readable spatial data for analysis in a GIS like ArcGIS Pro. Let me show you, dear readers, a particular conundrum that arose during my conversions.
From Raw to Processed
Once I had brought the GPS data into ArcGIS Pro (following quick smoothing in Excel), the biggest challenge was creating new fields for analysis, based on the inherent GPS output. A research baseline had to be established, and that meant I needed fields for:
- Transport type – walking, cycling, driving – based on speed thresholds
- Type of day, weekday or weekend
- A flag for being in school or out of school (for child subjects), based on a school’s specific instruction time
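The first two fields in the list above could be sketched in Python roughly as follows. This is only an illustration: the speed thresholds, the function names and the assumption that speed is stored in km/h are all hypothetical, not the values or names used in the research.

```python
from datetime import datetime

# Hypothetical speed thresholds in km/h; the post does not state the
# actual cut-offs used in the research.
WALK_MAX_KMH = 7.0
CYCLE_MAX_KMH = 25.0


def transport_type(speed_kmh):
    """Classify a GPS point by speed (assumed to be in km/h)."""
    if speed_kmh is None:
        return None  # leave the field empty when no speed was recorded
    if speed_kmh < WALK_MAX_KMH:
        return "walking"
    elif speed_kmh < CYCLE_MAX_KMH:
        return "cycling"
    else:
        return "driving"


def day_type(timestamp):
    """Return 'weekday' or 'weekend' for a datetime object."""
    # datetime.weekday() returns 0 for Monday through 6 for Sunday.
    return "weekend" if timestamp.weekday() >= 5 else "weekday"


print(transport_type(3.2))
print(day_type(datetime(2019, 3, 2)))
```

In the Field Calculator, functions like these would sit in the code block, with an expression along the lines of transport_type(!speed!), where speed stands in for whatever the real field is called.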
Since I was dealing with hundreds of thousands of GPS points, my first instinct was to apply logic to the raw data. Off I went to the Field Calculator to calculate the values for these new fields. The logic applied? If-Else statements, based on Python syntax.
Because I was dealing with dates and times, an intimate understanding of the datetime Python module was necessary. I did not come to this dance with that level of understanding. The intricacies of the datetime module were immediately apparent, and I quickly ran into difficulties with specific datetime syntax. One lesson I have gathered over years of IT work is to not beat my head against the wall, at least not on my own. What I was working with was a few lines of code, no more than fifteen in total. Thus, I tried a handful of adjustments, and when I was still unsatisfied with the calculated output, I phoned a friend. In this case, the friend was the GeoNet community.
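For readers meeting the datetime module for the first time, here is a minimal sketch of the kind of parsing involved. The timestamp layout is an assumption (GPS units differ), and parse_timestamp is a hypothetical helper, not the code I actually ran.

```python
from datetime import datetime

# Assumed timestamp layout, e.g. "2019-03-04 08:45:10"; the format
# string must be adjusted to match what the GPS unit actually exports.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_timestamp(text):
    """Turn a timestamp string from the CSV into a datetime object."""
    return datetime.strptime(text.strip(), TIMESTAMP_FORMAT)


stamp = parse_timestamp("2019-03-04 08:45:10")
print(stamp.weekday())          # day of week as an integer, Monday = 0
print(stamp.time())             # time-of-day part as a datetime.time
print(stamp.strftime("%H:%M"))  # 24-hour time as a zero-padded string
```

Most of the difficulty tends to sit in the format string: if it does not match the text exactly, strptime raises a ValueError rather than guessing.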
Altruism via Logic
With the aid of the GeoNet community, we ironed out my syntax errors and smoothed over the output needed for each derived field. GeoNet responses were lightning-fast and thoroughly detailed. One experienced user in particular stepped me through cascading issues with three-step If-Else statements.
Specific school instruction times were determined via an ID field, identifying which student went to which school. The ID field was the first level of the If-Else statement, followed by greater-than-less-than logic which was run on string-extracted 24-hour time values. Essentially, if the ID = SchoolX, then the value of the new field (e.g., being in or out of school) was determined by seeing if the GPS point timestamp was greater than or less than the bell-to-bell instruction time.
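The shape of that logic can be sketched as below. The school IDs, bell times and function name are made up for illustration, and this version compares datetime.time objects instead of the string-extracted values described above, so treat it as a variation on the approach, not the code GeoNet helped me write.

```python
from datetime import datetime, time

# Hypothetical bell-to-bell instruction times per school ID.
BELL_TIMES = {
    "SchoolX": (time(8, 30), time(15, 0)),
    "SchoolY": (time(9, 0), time(15, 30)),
}


def in_school(school_id, timestamp):
    """Flag a GPS point as in or out of school for the given school ID."""
    bells = BELL_TIMES.get(school_id)
    if bells is None:
        return None  # unknown school ID: leave the field empty
    first_bell, last_bell = bells
    # Step 1 picked the school; steps 2 and 3 test the point's time
    # against that school's first and last bell.
    if first_bell <= timestamp.time() <= last_bell:
        return "in school"
    return "out of school"


print(in_school("SchoolX", datetime(2019, 3, 4, 10, 15)))
print(in_school("SchoolX", datetime(2019, 3, 4, 16, 5)))
```

Keeping the bell times in a dictionary means that adding another school is a one-line change, with no need for another branch in the If-Else chain.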
I find simple logical calculations for new fields fairly intuitive inside the ArcGIS Pro Field Calculator. My preference is for Python, a preference born purely of experience and the language’s ties to GIS. However, field calculations can be performed with VBScript syntax, as well as the snazzy, relatively new Arcade language. I hope to tease out the benefits of Arcade in future years.
A Fulfilling Experience
I’m moving ahead with my research at a breakneck pace. With that pace comes a frenetic dive into other aspects of Esri software. Most notably in my case, the dive has landed in the pool called Forest CART, a modelling tool available in ArcGIS Pro since version 2.2. With the aid of Education and Research staff at Esri Canada, I am currently figuring out the intricacies of processing millions of GPS points into predictive models; the first phase has unexpectedly involved determining what processing hardware best fits a Big Data GPS dataset. The GeoAnalytics Server may enter the fray, but that is a discussion for a future blog.
Ultimately, the takeaway message I wanted to convey with this post is that programming modules (e.g., datetime) can be tricky, especially for the inexperienced programmer. Nevertheless, the experienced GeoNet community is ever-present, there to help smooth out issues. Coding derived values is a necessity when dealing with thousands to millions of GPS points, all the more so because typical date and time output needs considerable post-processing to get into an analyzable format. ArcGIS Pro continues to make it more intuitive to tackle such issues. Truly a further step toward consigning ArcMap to the history books ^_^.