mwlaxeref is an R package for converting between lake and waterbody identifiers, including NHDHR+, NHD, LAGOS, and state-specific local waterbody IDs.
For the examples in this vignette, we’ll use the following data from Wisconsin:
head(wis_lakes, n = 3)
#> state county lake.id lake.name
#> 1 wi fond_du_lac 8900 forest_lake
#> 2 wi burnett 2638700 big_trade_lake
#> 3 wi washburn 2451300 bass_lake
Each crosswalk function takes a data frame and returns it with the target ID added as a new column. For example, to crosswalk these Wisconsin lake IDs to NHDHR+, use the code below. The from_colname argument must be specified so that the function knows which column in wis_lakes contains the local IDs (the “lake.id” column in this case).
nhdhr_ids <- local_to_nhdhr(wis_lakes, from_colname = "lake.id", states = "wi")
head(nhdhr_ids, n = 3)
#> # A tibble: 3 × 5
#> state county lake.id nhdhr.id lake.name
#> <chr> <chr> <chr> <chr> <chr>
#> 1 wi fond_du_lac 8900 139268654 forest_lake
#> 2 wi burnett 2638700 91678049 big_trade_la…
#> 3 wi washburn 2451300 {AC03C0F2-2D44-4F50-8CF3-197E0EB7BF42} bass_lake
Similarly, NHDHR+ IDs can be converted to LAGOS.
nhdhr_ids <- nhdhr_ids[, "nhdhr.id"]
lagos_ids <- nhdhr_to_lagos(nhdhr_ids)
head(lagos_ids, n = 3)
#> # A tibble: 3 × 2
#> nhdhr.id lagos.id
#> <chr> <chr>
#> 1 139268654 4993
#> 2 91678049 4362
#> 3 {AC03C0F2-2D44-4F50-8CF3-197E0EB7BF42} 4554
There are six lake identification fields, and crosswalk functions exist in both directions for each of them. The six ID fields are the first six columns of the lake_id_xref data frame.
head(lake_id_xref, n = 3)
#> # A tibble: 3 × 9
#> nhdhr.id nhd.comid nhd.id lagos.id mglp.id local.id state agency id.field
#> <chr> <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 120017928 NA <NA> 139100 WI600079… 100 wi wisco… WBIC
#> 2 151959502 NA 13062951 119419 <NA> 100000 wi wisco… WBIC
#> 3 70331693 NA 13393553 100263 WI600009… 1000000 wi wisco… WBIC
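As a quick check, the ID field names can be listed directly from the table. This is a minimal sketch that assumes mwlaxeref is installed and that lake_id_xref is available once the package is attached, as in the examples above.

```r
library(mwlaxeref)

# The first six columns of lake_id_xref are the ID fields;
# the remaining columns (state, agency, id.field) describe the local ID
id_fields <- names(lake_id_xref)[1:6]
id_fields
```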
Each state has its own shortcut functions to each of the other lake identifiers. For example, to go from a Wisconsin local ID to a LAGOS ID, you can use the following:
lagos_id <- wi_to_lagos(wis_lakes, from_colname = "lake.id")
head(lagos_id, n = 3)
#> # A tibble: 3 × 5
#> state county lake.id lagos.id lake.name
#> <chr> <chr> <chr> <chr> <chr>
#> 1 wi fond_du_lac 8900 4993 forest_lake
#> 2 wi burnett 2638700 4362 big_trade_lake
#> 3 wi washburn 2451300 4554 bass_lake
Some states have multiple unique identifier fields. In others, several state agencies each maintain their own unique ID. These duplicate records often share the same NHDHR+, LAGOS, or other ID, so the agency and id_field arguments let you specify which agency’s unique ID to use (or which identification field, if several exist within the same agency).
For example, in Michigan many lakes have both a UNIQUE ID and a NEW KEY field. Converting from NHDHR+ or LAGOS to the Michigan local ID for these lakes yields duplicate rows, because each lake has two local IDs.
mi_nhdhr <- data.frame(nhdhr.id = "123397651")
nhdhr_to_mi(mi_nhdhr)
#> Warning in crosswalk_lake_id(data, from = "nhdhr", to = "local", from_colname = from_colname, :
#> Some of records in the output may be duplicated due to one-to-many relationships among lake identifiers.
#> i.e. newdat <- nhdhr_to_mi(mi_nhdhr)
#> nrow(newdat) > nrow(mi_nhdhr))
#> This likely means duplicated data. Proceed with caution.
#> Some states have multiple ID fields. Consider using the id_field argument
#> nhdhr.id local.id
#> 1 123397651 27-265
#> 2 123397651 13524
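The one-to-many relationship behind this warning can be reproduced with a toy table in base R. This sketch is not the package's implementation: the xref table below is a hand-built stand-in for lake_id_xref, and the id.field labels "field_a" and "field_b" are hypothetical placeholders, since the output above does not show which local ID belongs to which Michigan field.

```r
# Toy cross-reference table: one NHDHR+ ID mapped to two local IDs.
# The id.field labels are hypothetical placeholders, not real values.
xref <- data.frame(
  nhdhr.id = c("123397651", "123397651"),
  local.id = c("27-265", "13524"),
  id.field = c("field_a", "field_b"),
  stringsAsFactors = FALSE
)
input <- data.frame(nhdhr.id = "123397651", stringsAsFactors = FALSE)

# A plain join returns every matching row, so one input row becomes two
joined <- merge(input, xref, by = "nhdhr.id")
nrow(joined) > nrow(input)

# Keeping a single ID field restores one row per input lake
one_field <- joined[joined$id.field == "field_a", ]
nrow(one_field) == nrow(input)
```

Comparing row counts before and after a crosswalk, as done here, is a simple way to detect this kind of duplication in your own data.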
This duplication can be avoided by specifying the id_field argument, which selects a single ID field for the output.
The ID fields available for each state are listed in the id.field column of lake_id_xref. For Michigan, see unique(lake_id_xref$id.field[lake_id_xref$state == "mi"]).
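Putting this together for the Michigan lake above, a minimal sketch might look like the following. It assumes mwlaxeref is installed, that nhdhr_to_mi() accepts the id_field argument mentioned in the warning, and that the values stored in lake_id_xref$id.field are valid inputs for it. Picking the first field is arbitrary and only for illustration.

```r
library(mwlaxeref)

# List the ID fields recorded for Michigan, dropping any missing values
mi_fields <- unique(lake_id_xref$id.field[lake_id_xref$state == "mi"])
mi_fields <- mi_fields[!is.na(mi_fields)]
mi_fields

# Assumption: id_field takes one of the values listed above
mi_nhdhr <- data.frame(nhdhr.id = "123397651")
mi_one <- nhdhr_to_mi(mi_nhdhr, id_field = mi_fields[1])

# Compare row counts to check whether the duplication is gone
nrow(mi_one)
nrow(mi_nhdhr)
```

If the result still has more rows than the input, the agency argument may also be needed to narrow the match.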