Geolocation by light is as much an art as it is a science: decisions made throughout your data workflow will affect the accuracy of your location estimates. It is therefore key to understand which decisions to be mindful of. Below I briefly touch upon some of these aspects.
Before you start estimating locations there are a number of steps you can take to maximize the quality of your results.
Invariably, the quality of your location estimates will depend on the quality of your input (logger) data. Poor-quality data (due to false twilights or nest visits during the day) will negatively affect a location estimate's accuracy. To remove the most common sources of error the stk_screen_twl()
function is included. Eliminating poor-quality days will improve location estimates, as there is a temporal dependency between the current estimate and the previous one.
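As a minimal sketch, screening would be run on the raw logger data before any location estimation; the object and argument names below are illustrative assumptions, so check `?stk_screen_twl` for the actual interface:

```r
# load the package (assumed installed)
library(skytrackr)

# 'logger_data' is assumed to be a data frame of raw light readings
# as read in by the package; screening removes days with suspect
# twilight transitions before estimation
screened_data <- stk_screen_twl(logger_data)
```

Working from `screened_data` rather than the raw readings avoids propagating errors from poor-quality days into subsequent (temporally dependent) estimates.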
Unlike a purely twilight-based approach to location estimation, the {skytrackr} package can use all, or part of, the measured diurnal light cycle. By default only twilight data is used. However, in some cases it might be advantageous to use the full diurnal cycle by adjusting the range
parameter to include more data. Including more data will increase the computation time required for a good estimate. It is also important to note that some loggers (e.g. those by the Swiss Ornithological Institute) do not register a full diurnal profile. Always inspect a daily light profile to establish whether a full diurnal cycle is recorded, and exclude any baseline and saturated values (i.e. fill values).
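Inspecting a single day's profile only requires base R; the column names (`date`, `hour`, `lux`) and the cut-off values below are assumptions for illustration, not package defaults:

```r
# plot one day's light profile to check whether a full
# diurnal cycle was recorded
one_day <- subset(logger_data, date == as.Date("2021-07-01"))

plot(
  one_day$hour, log(one_day$lux),
  type = "l",
  xlab = "hour of day",
  ylab = "log(light, lux)",
  main = "daily light profile"
)

# exclude baseline and saturated (fill) values before estimation;
# thresholds are purely illustrative and logger-specific
clean_data <- subset(logger_data, lux > 0.01 & lux < 60000)
```

A profile that sits flat at the sensor floor or ceiling outside the twilight windows indicates that only twilight data should be used.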
There is also a trade-off between the amount of data used in a location estimate and the number of iterations used during optimization. If your data quality and/or frequency is low, it is advised to increase the number of optimization iterations. For high-quality data 3000 iterations generally yields good results, but increasing this number to 6000 might provide a more robust estimate in some cases. It is advised to inspect the performance of the routine on a single logger before proceeding to (batch) process all data. Iteration values in excess of 10K should generally not be required.
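A trial run on a single logger, with the iteration count made explicit, might look as follows; the function name `stk()` and the way iterations are passed are assumptions here, so verify against the package documentation:

```r
# trial run on one logger before batch processing; start with
# 3000 iterations and only increase if estimates look unstable
estimate <- stk(
  single_logger_data,
  iterations = 3000  # assumed argument name, see package docs
)

# if estimates are noisy, rerun with more iterations (e.g. 6000)
# and compare the two results before committing to a batch run
```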
The step-selection function constrains the validity of a proposed location estimate. However, the function used is approximate only. It must also be noted that while an individual might move a long distance across a day (in an absolute sense), its position from day to day might not change much (in the most extreme case there is no day-to-day movement, e.g. when the individual returns to a nesting location). The step-selection function should therefore reflect short-distance ranging movements (i.e. a rapid decay) rather than long-distance migration movements.
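One way to express such a rapidly decaying constraint is an exponential density on day-to-day displacement; this is a generic sketch of the idea, not the package's actual step-selection function, and the scale value is illustrative:

```r
# exponential step-selection kernel on day-to-day displacement (km);
# small scale values penalize long daily displacements heavily
step_density <- function(distance_km, scale_km = 25) {
  dexp(distance_km, rate = 1 / scale_km)
}

# short displacements receive far more weight than long ones
step_density(5)    # high density: plausible daily ranging movement
step_density(500)  # vanishingly small: implausible as a daily step
```

The choice of scale should match the species' ranging behaviour; a kernel tuned to migration-scale movements would under-constrain day-to-day estimates.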
After estimating locations you can inspect the results using the stk_map()
function. This will give you an initial idea of the accuracy of the estimates. In particular, values of the sky conditions
parameter should not be skewed dramatically towards higher values; such a skew suggests that the true parameter might lie out of bounds, in which case the dependent estimates of latitude and longitude will be wrong as well.
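A quick visual check of the fitted sky conditions can be done with base R; the column name `sky_conditions` on the estimation output is an assumption and should be checked against the actual output:

```r
# map the location estimates for a first visual inspection
stk_map(estimates)

# inspect the distribution of the fitted sky-condition parameter;
# a strong pile-up near the upper bound warrants suspicion
hist(
  estimates$sky_conditions,
  main = "fitted sky conditions",
  xlab = "sky condition parameter"
)
```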
Additionally, exploring uncertainty metrics, such as the spread of the uncertainty on both the longitude and latitude parameters, helps determine the quality of the estimated locations. For most optimizers a Gelman-Rubin diagnostic (or grd value) is returned in the data output. Gelman-Rubin diagnostic values < 1.05 are generally considered to indicate convergence of the parameter (location) estimates.
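Flagging non-converged days is then a simple filter; the column names (`grd`, `date`) are assumptions based on the output described above:

```r
# keep only days whose location estimate converged (grd < 1.05)
converged <- subset(estimates, grd < 1.05)

# list the days that need a rerun with more iterations,
# or should be excluded from downstream analyses
non_converged_days <- subset(estimates, grd >= 1.05)$date
```

Days that repeatedly fail to converge even at higher iteration counts are often better excluded than forced into a track.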