Selecting Projects

Author

Steffi LaZerte

Published

July 4, 2024

Here we explore a list of open or semi-open Motus projects to select additional project IDs for use in our pilot study.

This data set, TagsSpeciesProject.xlsx, was given to us by Birds Canada, but in future it should be accessible directly through the motus package.

Setup

source("XX_setup.R")
library(readxl)

Cleaning and Filtering

Cleaning

The species names (English and scientific) listed here aren’t always consistent, so we’ll omit them and use only the species IDs to match them with the NatureCounts metadata from XX_setup.

Then we consolidate the deployments within each project, because some projects listed part of their deployments under one species name and the rest under another (even when the species was the same).
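As a minimal sketch of the consolidation step, the toy rows below (made up for illustration, not taken from the spreadsheet) list one species ID under two spellings within the same project. Dropping the name column and summing by project and species ID collapses them into a single row.

```r
library(dplyr)

# Toy data (hypothetical): species 15580 appears under two name spellings
toy <- tibble(
  tagProjectID    = c(1, 1, 1),
  speciesID       = c(15580, 15580, 15900),
  speciesName     = c("Swainson's Thrush", "Swainsons Thrush", "Gray Catbird"),
  num_deployments = c(5, 3, 2)
)

consolidated <- toy |>
  select(-speciesName) |>                      # drop the inconsistent names
  summarize(num_deployments = sum(num_deployments),
            .by = c(tagProjectID, speciesID))  # one row per project and species ID

consolidated
```

The two rows for species 15580 end up as one row with 8 deployments, which is the behaviour we want before joining on the species ID.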

Filtering

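  • keep only access == 1, which marks fully public projects
  • omit species ID 129470, which is listed as attached to a Human…(?)
  • omit projects with any species whose English name contains "Eurasian", "European", or "Elaenia" (a rough screen for non-Canadian species)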
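The project-level screen and the proportion of matched tags can be sketched on toy data (the rows below are made up for illustration). The point to note is the handling of missing names: str_detect() returns NA for an NA name, so the condition keeps a project unless a name actually matches the pattern.

```r
library(dplyr)
library(stringr)

# Toy data (hypothetical): project 2 contains a name matching the pattern,
# project 1 contains a species with no match in the species list (NA name)
toy <- tibble(
  tagProjectID    = c(1, 1, 2, 2, 3),
  english_name    = c("Ovenbird", NA, "Eurasian Blackcap", "Barn Swallow", "Veery"),
  num_deployments = c(6, 2, 4, 4, 5)
)

screened <- toy |>
  mutate(good  = !is.na(english_name),   # TRUE when the species was matched
         other = str_detect(english_name, "Eurasian|European|Elaenia")) |>
  group_by(tagProjectID) |>
  filter(all(!other | is.na(other))) |>  # drop the whole project if any name matches
  mutate(prop_good_tags = sum(num_deployments[good]) / sum(num_deployments)) |>
  ungroup()

screened
```

Project 2 is dropped entirely, project 1 is kept with prop_good_tags of 0.75 (6 of 8 deployments matched), and project 3 is kept with a proportion of 1.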
p_sp <- read_excel("Data/01_Raw/TagsSpeciesProject.xlsx") |>
  rename_with(.cols = contains("No column"), \(x) "access") |>
  filter(access == 1,             # Only completely open projects
         speciesID != 129470) |>  # Don't worry about Human tags :D
  select(-speciesName, -motusEnglishName, -access) |> # Species Names are not consistent
  summarize(across(everything(), sum), .by = c("tagProjectID", "speciesID")) |>
  left_join(select(species_list, "species_id", "scientific_name", "english_name"),
            by = c("speciesID" = "species_id")) |>
  mutate(good = !is.na(scientific_name),
         other = str_detect(english_name, "Eurasian|European|Elaenia")) |>
  group_by(tagProjectID) |>
  filter(all(!other | is.na(other))) |> # Omit projects with non-Canadian species
  mutate(good_tags = sum(num_deployments[good]),
         prop_good_tags =  good_tags / sum(num_deployments)) |>
  ungroup() |>
  select(-other) |>
  arrange(desc(prop_good_tags))

gt(p_sp) |>
  fmt_number("prop_good_tags", decimals = 2) |>
  gt_theme() |>
  tab_options(container.height = px(600),
              container.overflow.y = "auto")
tagProjectID speciesID num_deployments scientific_name english_name good good_tags prop_good_tags
85 19250 112 Plectrophenax nivalis Snow Bunting TRUE 112 1.00
221 15580 2 Catharus ustulatus Swainson's Thrush TRUE 2 1.00
240 42218 39 Junco hyemalis Dark-eyed Junco TRUE 39 1.00
249 16610 53 Setophaga coronata Yellow-rumped Warbler TRUE 53 1.00
258 15600 10 Hylocichla mustelina Wood Thrush TRUE 20 1.00
258 16950 10 Parkesia motacilla Louisiana Waterthrush TRUE 20 1.00
259 18940 8 Ammospiza nelsoni Nelson's Sparrow TRUE 103 1.00
259 18950 39 Ammospiza caudacuta Saltmarsh Sparrow TRUE 103 1.00
259 18960 56 Ammospiza maritima Seaside Sparrow TRUE 103 1.00
306 15770 69 Turdus migratorius American Robin TRUE 71 1.00
306 15830 2 Ixoreus naevius Varied Thrush TRUE 71 1.00
308 15770 30 Turdus migratorius American Robin TRUE 30 1.00
310 14050 176 Progne subis Purple Martin TRUE 176 1.00
352 15580 75 Catharus ustulatus Swainson's Thrush TRUE 257 1.00
352 15900 30 Dumetella carolinensis Gray Catbird TRUE 257 1.00
352 18550 2 Pipilo maculatus Spotted Towhee TRUE 257 1.00
352 18830 10 Pooecetes gramineus Vesper Sparrow TRUE 257 1.00
352 18900 21 Ammodramus savannarum Grasshopper Sparrow TRUE 257 1.00
352 19450 23 Passerina amoena Lazuli Bunting TRUE 257 1.00
352 19610 31 Sturnella neglecta Western Meadowlark TRUE 257 1.00
352 19760 25 Molothrus ater Brown-headed Cowbird TRUE 257 1.00
352 20420 40 Spinus pinus Pine Siskin TRUE 257 1.00
360 14250 54 Hirundo rustica Barn Swallow TRUE 54 1.00
361 16920 11 Limnothlypis swainsonii Swainson's Warbler TRUE 11 1.00
363 16820 35 Setophaga striata Blackpoll Warbler TRUE 35 1.00
364 16820 1 Setophaga striata Blackpoll Warbler TRUE 55 1.00
364 20310 2 Pinicola enucleator Pine Grosbeak TRUE 55 1.00
364 20330 38 Haemorhous purpureus Purple Finch TRUE 55 1.00
364 20420 14 Spinus pinus Pine Siskin TRUE 55 1.00
370 15590 6 Catharus guttatus Hermit Thrush TRUE 6 1.00
371 36261 15 Locustella naevia Common Grasshopper Warbler TRUE 57 1.00
371 36271 22 Acrocephalus schoenobaenus Sedge Warbler TRUE 57 1.00
371 36475 20 Curruca communis Greater Whitethroat TRUE 57 1.00
377 16060 1 Toxostoma lecontei LeConte's Thrasher TRUE 8 1.00
377 16370 7 Phainopepla nitens Phainopepla TRUE 8 1.00
380 14250 44 Hirundo rustica Barn Swallow TRUE 84 1.00
380 35407 40 Riparia riparia Bank Swallow TRUE 84 1.00
387 40818 1 Melozone cabanisi Cabanis's Ground-Sparrow TRUE 1 1.00
388 15651 91 Turdus philomelos Song Thrush TRUE 91 1.00
391 16900 31 Protonotaria citrea Prothonotary Warbler TRUE 31 1.00
393 14040 24 Eremophila alpestris Horned Lark TRUE 130 1.00
393 16300 31 Anthus spragueii Sprague's Pipit TRUE 130 1.00
393 18910 14 Centronyx bairdii Baird's Sparrow TRUE 130 1.00
393 19130 5 Rhynchophanes mccownii Thick-billed Longspur TRUE 130 1.00
393 19160 56 Calcarius ornatus Chestnut-collared Longspur TRUE 130 1.00
408 18940 83 Ammospiza nelsoni Nelson's Sparrow TRUE 83 1.00
414 19500 137 Passerina ciris Painted Bunting TRUE 137 1.00
417 16930 81 Seiurus aurocapilla Ovenbird TRUE 226 1.00
417 18990 58 Melospiza melodia Song Sparrow TRUE 226 1.00
417 19030 63 Zonotrichia albicollis White-throated Sparrow TRUE 226 1.00
417 41384 24 Vireo olivaceus Red-eyed Vireo TRUE 226 1.00
424 14050 63 Progne subis Purple Martin TRUE 63 1.00
426 18830 36 Pooecetes gramineus Vesper Sparrow TRUE 37 1.00
426 18950 1 Ammospiza caudacuta Saltmarsh Sparrow TRUE 37 1.00
430 15590 2 Catharus guttatus Hermit Thrush TRUE 6 1.00
430 17310 4 Icteria virens Yellow-breasted Chat TRUE 6 1.00
438 36271 29 Acrocephalus schoenobaenus Sedge Warbler TRUE 91 1.00
438 36467 30 Sylvia borin Garden Warbler TRUE 91 1.00
438 36475 32 Curruca communis Greater Whitethroat TRUE 91 1.00
464 15550 1 Catharus fuscescens Veery TRUE 59 1.00
464 15580 28 Catharus ustulatus Swainson's Thrush TRUE 59 1.00
464 15900 4 Dumetella carolinensis Gray Catbird TRUE 59 1.00
464 16460 1 Leiothlypis peregrina Tennessee Warbler TRUE 59 1.00
464 16540 6 Setophaga americana Northern Parula TRUE 59 1.00
464 16600 1 Setophaga caerulescens Black-throated Blue Warbler TRUE 59 1.00
464 16660 1 Setophaga virens Black-throated Green Warbler TRUE 59 1.00
464 16800 5 Setophaga palmarum Palm Warbler TRUE 59 1.00
464 16880 2 Mniotilta varia Black-and-white Warbler TRUE 59 1.00
464 16890 2 Setophaga ruticilla American Redstart TRUE 59 1.00
464 16930 5 Seiurus aurocapilla Ovenbird TRUE 59 1.00
464 17000 3 Geothlypis trichas Common Yellowthroat TRUE 59 1.00
466 14250 91 Hirundo rustica Barn Swallow TRUE 113 1.00
466 35407 21 Riparia riparia Bank Swallow TRUE 113 1.00
466 40667 1 Hirundo rustica erythrogaster Barn Swallow (American) TRUE 113 1.00
469 35407 909 Riparia riparia Bank Swallow TRUE 909 1.00
475 18920 66 Centronyx henslowii Henslow's Sparrow TRUE 66 1.00
482 18880 135 Passerculus sandwichensis Savannah Sparrow TRUE 135 1.00
484 15830 2 Ixoreus naevius Varied Thrush TRUE 122 1.00
484 18550 61 Pipilo maculatus Spotted Towhee TRUE 122 1.00
484 19050 4 Zonotrichia leucophrys White-crowned Sparrow TRUE 122 1.00
484 19060 55 Zonotrichia atricapilla Golden-crowned Sparrow TRUE 122 1.00
489 19520 33 Dolichonyx oryzivorus Bobolink TRUE 33 1.00
496 15550 15 Catharus fuscescens Veery TRUE 54 1.00
496 15580 1 Catharus ustulatus Swainson's Thrush TRUE 54 1.00
496 15590 38 Catharus guttatus Hermit Thrush TRUE 54 1.00
497 19030 68 Zonotrichia albicollis White-throated Sparrow TRUE 68 1.00
499 18990 4 Melospiza melodia Song Sparrow TRUE 4 1.00
515 15590 1 Catharus guttatus Hermit Thrush TRUE 42 1.00
515 15770 7 Turdus migratorius American Robin TRUE 42 1.00
515 15900 6 Dumetella carolinensis Gray Catbird TRUE 42 1.00
515 15970 3 Toxostoma rufum Brown Thrasher TRUE 42 1.00
515 19030 25 Zonotrichia albicollis White-throated Sparrow TRUE 42 1.00
524 15770 1 Turdus migratorius American Robin TRUE 1 1.00
529 42067 74 Oenanthe oenanthe/seebohmi Northern/Atlas Wheatear TRUE 74 1.00
531 14120 2 Tachycineta bicolor Tree Swallow TRUE 5 1.00
531 18550 2 Pipilo maculatus Spotted Towhee TRUE 5 1.00
531 40824 1 Pipilo maculatus x erythrophthalmus Spotted x Eastern Towhee (hybrid) TRUE 5 1.00
536 15580 20 Catharus ustulatus Swainson's Thrush TRUE 20 1.00
537 15550 4 Catharus fuscescens Veery TRUE 6 1.00
537 17150 2 Cardellina canadensis Canada Warbler TRUE 6 1.00
545 15970 2 Toxostoma rufum Brown Thrasher TRUE 3 1.00
545 16620 1 Setophaga coronata coronata Yellow-rumped Warbler (Myrtle) TRUE 3 1.00
551 13420 9 Vireo solitarius Blue-headed Vireo TRUE 203 1.00
551 15590 45 Catharus guttatus Hermit Thrush TRUE 203 1.00
551 16580 24 Setophaga magnolia Magnolia Warbler TRUE 203 1.00
551 16620 42 Setophaga coronata coronata Yellow-rumped Warbler (Myrtle) TRUE 203 1.00
551 18990 17 Melospiza melodia Song Sparrow TRUE 203 1.00
551 19360 66 Cardinalis cardinalis Northern Cardinal TRUE 203 1.00
570 15580 2 Catharus ustulatus Swainson's Thrush TRUE 2 1.00
604 20910 2 Passer domesticus House Sparrow TRUE 2 1.00
607 15900 2 Dumetella carolinensis Gray Catbird TRUE 10 1.00
607 16930 1 Seiurus aurocapilla Ovenbird TRUE 10 1.00
607 16940 3 Parkesia noveboracensis Northern Waterthrush TRUE 10 1.00
607 17000 1 Geothlypis trichas Common Yellowthroat TRUE 10 1.00
607 17130 2 Setophaga citrina Hooded Warbler TRUE 10 1.00
607 17310 1 Icteria virens Yellow-breasted Chat TRUE 10 1.00
609 12172 7 Empidonax traillii extimus Willow Flycatcher (Southwestern) TRUE 7 1.00
612 35407 100 Riparia riparia Bank Swallow TRUE 100 1.00
617 16830 18 Setophaga cerulea Cerulean Warbler TRUE 18 1.00
619 15580 20 Catharus ustulatus Swainson's Thrush TRUE 41 1.00
619 16600 5 Setophaga caerulescens Black-throated Blue Warbler TRUE 41 1.00
619 16890 16 Setophaga ruticilla American Redstart TRUE 41 1.00
621 16560 8 Setophaga petechia Yellow Warbler TRUE 8 1.00
623 17690 1 Piranga ludoviciana Western Tanager TRUE 3 1.00
623 19410 1 Pheucticus melanocephalus Black-headed Grosbeak TRUE 3 1.00
623 40789 1 Icteria virens auricollis Yellow-breasted Chat (auricollis) TRUE 3 1.00
627 16420 40 Vermivora chrysoptera Golden-winged Warbler TRUE 40 1.00
634 19040 10 Zonotrichia querula Harris's Sparrow TRUE 10 1.00
645 15600 3 Hylocichla mustelina Wood Thrush TRUE 3 1.00
657 15580 7 Catharus ustulatus Swainson's Thrush TRUE 10 1.00
657 19050 3 Zonotrichia leucophrys White-crowned Sparrow TRUE 10 1.00
660 14120 10 Tachycineta bicolor Tree Swallow TRUE 10 1.00
661 35407 24 Riparia riparia Bank Swallow TRUE 24 1.00
678 15600 15 Hylocichla mustelina Wood Thrush TRUE 15 1.00
691 20480 1 Spinus psaltria Lesser Goldfinch TRUE 1 1.00
556 8040 2 NA NA FALSE 13 0.87
556 14230 2 Petrochelidon pyrrhonota Cliff Swallow TRUE 13 0.87
556 14280 1 Poecile atricapillus Black-capped Chickadee TRUE 13 0.87
556 15560 1 Catharus minimus Gray-cheeked Thrush TRUE 13 0.87
556 16330 3 Bombycilla cedrorum Cedar Waxwing TRUE 13 0.87
556 20330 1 Haemorhous purpureus Purple Finch TRUE 13 0.87
556 41384 3 Vireo olivaceus Red-eyed Vireo TRUE 13 0.87
556 45168 2 Junco hyemalis hyemalis/carolinensis Dark-eyed Junco (Slate-colored) TRUE 13 0.87
406 3570 23 NA NA FALSE 143 0.86
406 15580 14 Catharus ustulatus Swainson's Thrush TRUE 143 0.86
406 18760 24 Spizelloides arborea American Tree Sparrow TRUE 143 0.86
406 18920 57 Centronyx henslowii Henslow's Sparrow TRUE 143 0.86
406 19030 1 Zonotrichia albicollis White-throated Sparrow TRUE 143 0.86
406 42218 13 Junco hyemalis Dark-eyed Junco TRUE 143 0.86
406 45168 34 Junco hyemalis hyemalis/carolinensis Dark-eyed Junco (Slate-colored) TRUE 143 0.86
115 16610 1 Setophaga coronata Yellow-rumped Warbler TRUE 5 0.83
115 16920 1 Limnothlypis swainsonii Swainson's Warbler TRUE 5 0.83
115 16960 1 Geothlypis formosa Kentucky Warbler TRUE 5 0.83
115 18760 1 Spizelloides arborea American Tree Sparrow TRUE 5 0.83
115 19500 1 Passerina ciris Painted Bunting TRUE 5 0.83
115 31856 1 NA NA FALSE 5 0.83
520 4450 10 NA NA FALSE 18 0.62
520 4750 1 NA NA FALSE 18 0.62
520 16940 18 Parkesia noveboracensis Northern Waterthrush TRUE 18 0.62
282 3570 22 NA NA FALSE 33 0.60
282 19520 11 Dolichonyx oryzivorus Bobolink TRUE 33 0.60
282 19600 22 Sturnella magna/lilianae Eastern/Chihuahuan Meadowlark TRUE 33 0.60
582 45168 59 Junco hyemalis hyemalis/carolinensis Dark-eyed Junco (Slate-colored) TRUE 59 0.60
582 47282 40 NA NA FALSE 59 0.60
614 4980 28 NA NA FALSE 41 0.48
614 7871 17 NA NA FALSE 41 0.48
614 18570 41 Pipilo erythrophthalmus Eastern Towhee TRUE 41 0.48
4 3761 2 NA NA FALSE 64 0.30
4 3770 3 NA NA FALSE 64 0.30
4 3830 42 NA NA FALSE 64 0.30
4 4630 30 NA NA FALSE 64 0.30
4 4670 74 NA NA FALSE 64 0.30
4 19500 64 Passerina ciris Painted Bunting TRUE 64 0.30
193 7680 36 NA NA FALSE 13 0.25
193 7871 3 NA NA FALSE 13 0.25
193 15600 11 Hylocichla mustelina Wood Thrush TRUE 13 0.25
193 16960 2 Geothlypis formosa Kentucky Warbler TRUE 13 0.25
9 4690 52 NA NA FALSE 5 0.06
9 18940 1 Ammospiza nelsoni Nelson's Sparrow TRUE 5 0.06
9 18950 4 Ammospiza caudacuta Saltmarsh Sparrow TRUE 5 0.06
9 100190 4 NA NA FALSE 5 0.06
9 100250 9 NA NA FALSE 5 0.06
9 100420 3 NA NA FALSE 5 0.06
9 100450 1 NA NA FALSE 5 0.06
9 124422 1 NA NA FALSE 5 0.06
9 124430 3 NA NA FALSE 5 0.06
27 4070 24 NA NA FALSE 0 0.00
27 4100 6 NA NA FALSE 0 0.00
27 4180 14 NA NA FALSE 0 0.00
27 4630 14 NA NA FALSE 0 0.00
27 4690 32 NA NA FALSE 0 0.00
27 4760 58 NA NA FALSE 0 0.00
27 4800 1 NA NA FALSE 0 0.00
27 4820 6 NA NA FALSE 0 0.00
27 5010 21 NA NA FALSE 0 0.00
68 4070 32 NA NA FALSE 0 0.00
68 4100 2 NA NA FALSE 0 0.00
68 4180 16 NA NA FALSE 0 0.00
68 4630 10 NA NA FALSE 0 0.00
68 4680 47 NA NA FALSE 0 0.00
68 4690 9 NA NA FALSE 0 0.00
68 4750 13 NA NA FALSE 0 0.00
68 4760 21 NA NA FALSE 0 0.00
68 4780 2 NA NA FALSE 0 0.00
68 4870 1 NA NA FALSE 0 0.00
68 5000 5 NA NA FALSE 0 0.00
68 5010 19 NA NA FALSE 0 0.00
78 4070 5 NA NA FALSE 0 0.00
78 4180 212 NA NA FALSE 0 0.00
78 4420 4 NA NA FALSE 0 0.00
78 4450 46 NA NA FALSE 0 0.00
78 4530 5 NA NA FALSE 0 0.00
78 4630 13 NA NA FALSE 0 0.00
78 4670 1 NA NA FALSE 0 0.00
78 4680 75 NA NA FALSE 0 0.00
78 4690 475 NA NA FALSE 0 0.00
78 4750 72 NA NA FALSE 0 0.00
78 4760 126 NA NA FALSE 0 0.00
78 4780 1 NA NA FALSE 0 0.00
78 4800 2 NA NA FALSE 0 0.00
78 4820 54 NA NA FALSE 0 0.00
78 4890 37 NA NA FALSE 0 0.00
147 41600 119 NA NA FALSE 0 0.00
156 5520 120 NA NA FALSE 0 0.00
194 7720 104 NA NA FALSE 0 0.00
194 7871 149 NA NA FALSE 0 0.00
271 3110 14 NA NA FALSE 0 0.00
271 3570 33 NA NA FALSE 0 0.00
271 45581 2 NA NA FALSE 0 0.00
276 230 54 NA NA FALSE 0 0.00
307 125480 12 NA NA FALSE 0 0.00
349 4820 150 NA NA FALSE 0 0.00
356 4180 3 NA NA FALSE 0 0.00
356 4840 2 NA NA FALSE 0 0.00
356 40372 5 NA NA FALSE 0 0.00
357 1740 4 NA NA FALSE 0 0.00
357 2170 1 NA NA FALSE 0 0.00
357 2390 4 NA NA FALSE 0 0.00
357 4160 1 NA NA FALSE 0 0.00
357 4380 13 NA NA FALSE 0 0.00
357 4620 4 NA NA FALSE 0 0.00
357 4670 36 NA NA FALSE 0 0.00
357 4700 20 NA NA FALSE 0 0.00
357 4820 12 NA NA FALSE 0 0.00
357 4890 14 NA NA FALSE 0 0.00
357 5460 3 NA NA FALSE 0 0.00
357 41600 13 NA NA FALSE 0 0.00
368 4820 62 NA NA FALSE 0 0.00
382 360 75 NA NA FALSE 0 0.00
390 5280 3 NA NA FALSE 0 0.00
390 45581 1 NA NA FALSE 0 0.00
418 100430 22 NA NA FALSE 0 0.00
420 100430 19 NA NA FALSE 0 0.00
421 100390 1 NA NA FALSE 0 0.00
421 100430 3 NA NA FALSE 0 0.00
421 100450 2 NA NA FALSE 0 0.00
421 100520 2 NA NA FALSE 0 0.00
427 100250 3 NA NA FALSE 0 0.00
427 100450 1 NA NA FALSE 0 0.00
427 100580 1 NA NA FALSE 0 0.00
427 252456 15 NA NA FALSE 0 0.00
432 100430 16 NA NA FALSE 0 0.00
432 100450 2 NA NA FALSE 0 0.00
432 124470 1 NA NA FALSE 0 0.00
434 7680 56 NA NA FALSE 0 0.00
436 32990 4 NA NA FALSE 0 0.00
437 263371 191 NA NA FALSE 0 0.00
448 120210 31 NA NA FALSE 0 0.00
450 1260 192 NA NA FALSE 0 0.00
454 7680 23 NA NA FALSE 0 0.00
458 4750 6 NA NA FALSE 0 0.00
458 4820 101 NA NA FALSE 0 0.00
458 4900 18 NA NA FALSE 0 0.00
468 40261 8 NA NA FALSE 0 0.00
486 1650 3 NA NA FALSE 0 0.00
486 1660 1 NA NA FALSE 0 0.00
486 100190 7 NA NA FALSE 0 0.00
486 100550 2 NA NA FALSE 0 0.00
498 100190 1 NA NA FALSE 0 0.00
498 100230 1 NA NA FALSE 0 0.00
498 100250 1 NA NA FALSE 0 0.00
498 100270 1 NA NA FALSE 0 0.00
498 100430 3 NA NA FALSE 0 0.00
498 100450 1 NA NA FALSE 0 0.00
498 123940 4 NA NA FALSE 0 0.00
498 124470 4 NA NA FALSE 0 0.00
500 100400 20 NA NA FALSE 0 0.00
501 100430 24 NA NA FALSE 0 0.00
525 7810 6 NA NA FALSE 0 0.00
532 100190 4 NA NA FALSE 0 0.00
532 100230 5 NA NA FALSE 0 0.00
532 100270 3 NA NA FALSE 0 0.00
532 100420 6 NA NA FALSE 0 0.00
532 100430 63 NA NA FALSE 0 0.00
532 100450 1 NA NA FALSE 0 0.00
533 252456 38 NA NA FALSE 0 0.00
534 7680 3 NA NA FALSE 0 0.00
534 47282 2 NA NA FALSE 0 0.00
541 7680 6 NA NA FALSE 0 0.00
554 125480 55 NA NA FALSE 0 0.00
555 257061 8 NA NA FALSE 0 0.00
555 257202 1 NA NA FALSE 0 0.00
563 100190 3 NA NA FALSE 0 0.00
563 100250 2 NA NA FALSE 0 0.00
563 100380 5 NA NA FALSE 0 0.00
563 100560 1 NA NA FALSE 0 0.00
563 124432 1 NA NA FALSE 0 0.00
585 3580 2 NA NA FALSE 0 0.00
586 4990 15 NA NA FALSE 0 0.00
608 7720 2 NA NA FALSE 0 0.00
608 7871 46 NA NA FALSE 0 0.00
611 257184 1 NA NA FALSE 0 0.00
611 257201 1 NA NA FALSE 0 0.00
611 257216 5 NA NA FALSE 0 0.00
632 7871 20 NA NA FALSE 0 0.00
635 100190 1 NA NA FALSE 0 0.00
635 100230 2 NA NA FALSE 0 0.00
635 100430 1 NA NA FALSE 0 0.00
644 100300 1 NA NA FALSE 0 0.00
644 121401 1 NA NA FALSE 0 0.00
646 100430 10 NA NA FALSE 0 0.00
646 124470 10 NA NA FALSE 0 0.00
648 124240 2 NA NA FALSE 0 0.00
648 125480 14 NA NA FALSE 0 0.00
649 230 30 NA NA FALSE 0 0.00
651 100680 40 NA NA FALSE 0 0.00
664 7680 51 NA NA FALSE 0 0.00
667 4990 5 NA NA FALSE 0 0.00
672 7871 45 NA NA FALSE 0 0.00
674 200 75 NA NA FALSE 0 0.00
688 100240 3 NA NA FALSE 0 0.00
688 100270 5 NA NA FALSE 0 0.00
688 100430 8 NA NA FALSE 0 0.00
688 100530 2 NA NA FALSE 0 0.00
690 100430 10 NA NA FALSE 0 0.00
690 100460 7 NA NA FALSE 0 0.00

Summarize

Now we can summarize these projects by the number of species, the number of deployments (tags), and the average number of deployments per species.

We’ll aim to include projects with a breadth of species but also reasonable coverage, so we exclude projects where fewer than 100% of the tags are on passerines, as well as projects with three or fewer species.

p <- p_sp |>
  filter(prop_good_tags == 1) |>
  group_by(tagProjectID) |>
  summarize(total_tags = sum(num_deployments),
            n_species = n_distinct(speciesID),
            mean_tags_per_species = mean(num_deployments),
            species = list(unique(english_name))) |>
  filter(n_species > 3) |>
  arrange(desc(mean_tags_per_species), desc(n_species), desc(total_tags))

gt(p) |>
  fmt_number(columns = "mean_tags_per_species", decimals = 1) |>
  gt_theme()
tagProjectID total_tags n_species mean_tags_per_species species
417 226 4 56.5 Ovenbird, Song Sparrow, White-throated Sparrow, Red-eyed Vireo
551 203 6 33.8 Blue-headed Vireo, Hermit Thrush, Magnolia Warbler, Yellow-rumped Warbler (Myrtle), Song Sparrow, Northern Cardinal
484 122 4 30.5 Varied Thrush, Spotted Towhee, White-crowned Sparrow, Golden-crowned Sparrow
352 257 9 28.6 Swainson's Thrush, Gray Catbird, Spotted Towhee, Vesper Sparrow, Grasshopper Sparrow, Lazuli Bunting, Western Meadowlark, Brown-headed Cowbird, Pine Siskin
393 130 5 26.0 Horned Lark, Sprague's Pipit, Baird's Sparrow, Thick-billed Longspur, Chestnut-collared Longspur
364 55 4 13.8 Blackpoll Warbler, Pine Grosbeak, Purple Finch, Pine Siskin
515 42 5 8.4 Hermit Thrush, American Robin, Gray Catbird, Brown Thrasher, White-throated Sparrow
464 59 12 4.9 Veery, Swainson's Thrush, Gray Catbird, Tennessee Warbler, Northern Parula, Black-throated Blue Warbler, Black-throated Green Warbler, Palm Warbler, Black-and-white Warbler, American Redstart, Ovenbird, Common Yellowthroat
607 10 6 1.7 Gray Catbird, Ovenbird, Northern Waterthrush, Common Yellowthroat, Hooded Warbler, Yellow-breasted Chat
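As a quick check of how these summary columns relate, the sketch below recomputes the row for project 417 from the four deployment counts listed for it in the first table. Because each species occupies exactly one row after consolidation, the mean tags per species equals the total tags divided by the number of species.

```r
library(dplyr)

# Deployment counts for project 417, copied from the first table
p417 <- tibble(
  tagProjectID    = 417,
  speciesID       = c(16930, 18990, 19030, 41384),
  num_deployments = c(81, 58, 63, 24)
)

check <- p417 |>
  summarize(total_tags = sum(num_deployments),
            n_species = n_distinct(speciesID),
            mean_tags_per_species = mean(num_deployments),
            .by = tagProjectID)

check  # total_tags = 226, n_species = 4, mean_tags_per_species = 56.5
```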

Data sizes

Now we can check the amount of data per project (to see what we’re in for!).

For reference, 7,006,847,799 bytes is ~ 7 GB
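The conversion behind that figure is a simple division. The sketch below shows it in decimal gigabytes, which is the convention used here, alongside binary gibibytes for comparison, since some file managers report sizes in those units.

```r
bytes <- 7006847799   # the byte count quoted above

gb  <- bytes / 1e9    # decimal gigabytes (1 GB = 10^9 bytes)
gib <- bytes / 2^30   # binary gibibytes (1 GiB = 2^30 bytes)

round(c(GB = gb, GiB = gib), 2)
```

This comes to roughly 7.0 GB, or about 6.5 GiB.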

dir.create("Data/Temp")
status <- map(
  set_names(p$tagProjectID), 
  \(x) tellme(x, dir = "Data/Temp",  new = TRUE)) |>
  list_rbind(names_to = "proj_id")
unlink("Data/Temp", recursive = TRUE)
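The map-then-bind pattern used here can be tried without Motus credentials. In the sketch below, fake_status() is a hypothetical stand-in for tellme() that returns a one-row summary, and check_sizes() is an illustrative wrapper, not part of the analysis. It also shows one possible variation: using a temporary directory with on.exit() so the scratch files are removed even if a call fails partway through.

```r
library(purrr)
library(tibble)

# Hypothetical stand-in for motus::tellme(): one summary row per project
fake_status <- function(proj_id, dir) {
  tibble(numHits = proj_id * 10, numBytes = proj_id * 1000)
}

check_sizes <- function(proj_ids) {
  tmp <- tempfile("motus_status_")                    # scratch directory
  dir.create(tmp)
  on.exit(unlink(tmp, recursive = TRUE), add = TRUE)  # clean up even on error

  map(set_names(proj_ids), \(x) fake_status(x, dir = tmp)) |>
    list_rbind(names_to = "proj_id")                  # list names become proj_id
}

status_toy <- check_sizes(c(417, 551))
status_toy
```

Because set_names() names each element by its project ID, list_rbind() can carry those names into a proj_id column, which is stored as character.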

So this isn’t too bad, data-wise. I think we could use all projects.

status |> 
  mutate(Megabytes = numBytes / 1000000) |>
  arrange(desc(numBytes)) |>
  gt() |>
  fmt_number(decimals = 0)  |>
  gt_theme()
proj_id     numHits       numBytes    numRuns  numBatches   numGPS  Megabytes
551      74,368,213  7,006,847,799  1,283,286       3,059  386,210      7,007
352      70,567,612  6,655,989,412  1,053,008      39,044  659,503      6,656
417      47,585,940  4,500,299,234  1,007,044       2,647  379,951      4,500
484      25,653,624  2,470,082,118  1,076,391       3,515  495,525      2,470
515       2,198,407    256,103,381    425,980       4,517  572,886        256
393       1,444,599    160,645,217     80,053       4,799  447,871        161
364          94,467     25,729,831     17,000       1,888  330,579         26
464          80,673     21,734,415     34,752       1,873  253,377         22
607             939      1,544,997        456         147   29,634          2
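To put a single number on the download, the sketch below sums the numBytes values copied from the table above. The values are stored as ordinary R numerics (doubles), since several exceed the range of R's 32-bit integers.

```r
# numBytes per project, copied from the table above
num_bytes <- c(`551` = 7006847799, `352` = 6655989412, `417` = 4500299234,
               `484` = 2470082118, `515` = 256103381,  `393` = 160645217,
               `364` = 25729831,   `464` = 21734415,   `607` = 1544997)

total_gb <- sum(num_bytes) / 1e9   # decimal gigabytes
round(total_gb, 1)
```

All nine projects together come to about 21.1 GB, almost all of it from the four largest projects.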

Reproducibility

devtools::session_info()
─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.4.0 (2024-04-24)
 os       Ubuntu 22.04.4 LTS
 system   x86_64, linux-gnu
 ui       X11
 language en_CA:en
 collate  en_CA.UTF-8
 ctype    en_CA.UTF-8
 tz       America/Winnipeg
 date     2024-07-04
 pandoc   3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)

─ Packages ───────────────────────────────────────────────────────────────────
 ! package       * version  date (UTC) lib source
 P arrow         * 16.1.0   2024-05-25 [?] CRAN (R 4.4.0)
 P assertr       * 3.0.1    2023-11-23 [?] CRAN (R 4.4.0)
 P assertthat      0.2.1    2019-03-21 [?] CRAN (R 4.4.0)
 P bit             4.0.5    2022-11-15 [?] CRAN (R 4.4.0)
 P bit64           4.0.5    2020-08-30 [?] CRAN (R 4.4.0)
 P blob            1.2.4    2023-03-17 [?] CRAN (R 4.4.0)
 P cachem          1.1.0    2024-05-16 [?] CRAN (R 4.4.0)
 P cellranger      1.1.0    2016-07-27 [?] CRAN (R 4.4.0)
 P class           7.3-22   2023-05-03 [?] CRAN (R 4.3.1)
 P classInt        0.4-10   2023-09-05 [?] CRAN (R 4.4.0)
 P cli             3.6.2    2023-12-11 [?] CRAN (R 4.4.0)
 P codetools       0.2-19   2023-02-01 [?] CRAN (R 4.2.2)
 P colorspace      2.1-0    2023-01-23 [?] CRAN (R 4.4.0)
 P DBI           * 1.2.3    2024-06-02 [?] CRAN (R 4.4.0)
 P dbplyr          2.5.0    2024-03-19 [?] CRAN (R 4.4.0)
 P devtools        2.4.5    2022-10-11 [?] CRAN (R 4.4.0)
 P digest          0.6.35   2024-03-11 [?] CRAN (R 4.4.0)
 P dplyr         * 1.1.4    2023-11-17 [?] CRAN (R 4.4.0)
 P e1071           1.7-14   2023-12-06 [?] CRAN (R 4.4.0)
 P ebirdst       * 3.2022.3 2024-03-05 [?] CRAN (R 4.4.0)
 P ellipsis        0.3.2    2021-04-29 [?] CRAN (R 4.4.0)
 P evaluate        0.23     2023-11-01 [?] CRAN (R 4.4.0)
 P fansi           1.0.6    2023-12-08 [?] CRAN (R 4.4.0)
 P fastmap         1.2.0    2024-05-15 [?] CRAN (R 4.4.0)
 P fs              1.6.4    2024-04-25 [?] CRAN (R 4.4.0)
 P furrr         * 0.3.1    2022-08-15 [?] CRAN (R 4.4.0)
 P future        * 1.33.2   2024-03-26 [?] CRAN (R 4.4.0)
 P generics        0.1.3    2022-07-05 [?] CRAN (R 4.4.0)
 P ggplot2       * 3.5.1    2024-04-23 [?] CRAN (R 4.4.0)
 P ggrepel       * 0.9.5    2024-01-10 [?] CRAN (R 4.4.0)
 P ggspatial     * 1.1.9    2023-08-17 [?] CRAN (R 4.4.0)
 P globals         0.16.3   2024-03-08 [?] CRAN (R 4.4.0)
 P glue            1.7.0    2024-01-09 [?] CRAN (R 4.4.0)
 P gt            * 0.10.1   2024-01-17 [?] CRAN (R 4.4.0)
 P gtable          0.3.5    2024-04-22 [?] CRAN (R 4.4.0)
 P hms             1.1.3    2023-03-21 [?] CRAN (R 4.4.0)
 P htmltools       0.5.8.1  2024-04-04 [?] CRAN (R 4.4.0)
 P htmlwidgets     1.6.4    2023-12-06 [?] CRAN (R 4.4.0)
 P httpuv          1.6.15   2024-03-26 [?] CRAN (R 4.4.0)
 P httr            1.4.7    2023-08-15 [?] CRAN (R 4.4.0)
 P jsonlite        1.8.8    2023-12-04 [?] CRAN (R 4.4.0)
 P KernSmooth      2.23-22  2023-07-10 [?] CRAN (R 4.3.1)
 P knitr           1.47     2024-05-29 [?] CRAN (R 4.4.0)
 P later           1.3.2    2023-12-06 [?] CRAN (R 4.4.0)
 P lifecycle       1.0.4    2023-11-07 [?] CRAN (R 4.4.0)
 P listenv         0.9.1    2024-01-29 [?] CRAN (R 4.4.0)
 P lubridate     * 1.9.3    2023-09-27 [?] CRAN (R 4.4.0)
 P lutz          * 0.3.2    2023-10-17 [?] CRAN (R 4.4.0)
 P magrittr        2.0.3    2022-03-30 [?] CRAN (R 4.4.0)
 P memoise         2.0.1    2021-11-26 [?] CRAN (R 4.4.0)
 P mime            0.12     2021-09-28 [?] CRAN (R 4.4.0)
 P miniUI          0.1.1.1  2018-05-18 [?] CRAN (R 4.4.0)
 P motus         * 6.1.0    2024-05-02 [?] Github (motuswts/motus@a53a8b8)
 P munsell         0.5.1    2024-04-01 [?] CRAN (R 4.4.0)
 P naturecounts    0.4.0    2024-05-02 [?] Github (birdscanada/naturecounts@a6e52da)
 P parallelly      1.37.1   2024-02-29 [?] CRAN (R 4.4.0)
 P patchwork     * 1.2.0    2024-01-08 [?] CRAN (R 4.4.0)
 P pillar          1.9.0    2023-03-22 [?] CRAN (R 4.4.0)
 P pkgbuild        1.4.4    2024-03-17 [?] CRAN (R 4.4.0)
 P pkgconfig       2.0.3    2019-09-22 [?] CRAN (R 4.4.0)
 P pkgload         1.3.4    2024-01-16 [?] CRAN (R 4.4.0)
 P profvis         0.3.8    2023-05-02 [?] CRAN (R 4.4.0)
 P promises        1.3.0    2024-04-05 [?] CRAN (R 4.4.0)
 P proxy           0.4-27   2022-06-09 [?] CRAN (R 4.4.0)
 P purrr         * 1.0.2    2023-08-10 [?] CRAN (R 4.4.0)
 P R6              2.5.1    2021-08-19 [?] CRAN (R 4.4.0)
 P Rcpp            1.0.12   2024-01-09 [?] CRAN (R 4.4.0)
 P readr         * 2.1.5    2024-01-10 [?] CRAN (R 4.4.0)
 P readxl        * 1.4.3    2023-07-06 [?] CRAN (R 4.4.0)
 P remotes         2.5.0    2024-03-17 [?] CRAN (R 4.4.0)
   renv            1.0.7    2024-04-11 [1] CRAN (R 4.4.0)
 P rlang           1.1.3    2024-01-10 [?] CRAN (R 4.4.0)
 P rmarkdown       2.27     2024-05-17 [?] CRAN (R 4.4.0)
 P rnaturalearth * 1.0.1    2023-12-15 [?] CRAN (R 4.4.0)
 P RSQLite         2.3.6    2024-03-31 [?] CRAN (R 4.4.0)
 P rstudioapi      0.16.0   2024-03-24 [?] CRAN (R 4.4.0)
 P sass            0.4.9    2024-03-15 [?] CRAN (R 4.4.0)
 P scales          1.3.0    2023-11-28 [?] CRAN (R 4.4.0)
 P sessioninfo     1.2.2    2021-12-06 [?] CRAN (R 4.4.0)
 P sf            * 1.0-16   2024-03-24 [?] CRAN (R 4.4.0)
 P shiny           1.8.1.1  2024-04-02 [?] CRAN (R 4.4.0)
 P stringi         1.8.4    2024-05-06 [?] CRAN (R 4.4.0)
 P stringr       * 1.5.1    2023-11-14 [?] CRAN (R 4.4.0)
 P terra           1.7-71   2024-01-31 [?] CRAN (R 4.4.0)
 P tibble        * 3.2.1    2023-03-20 [?] CRAN (R 4.4.0)
 P tidyr         * 1.3.1    2024-01-24 [?] CRAN (R 4.4.0)
 P tidyselect      1.2.1    2024-03-11 [?] CRAN (R 4.4.0)
 P timechange      0.3.0    2024-01-18 [?] CRAN (R 4.4.0)
 P tzdb            0.4.0    2023-05-12 [?] CRAN (R 4.4.0)
 P units         * 0.8-5    2023-11-28 [?] CRAN (R 4.4.0)
 P urlchecker      1.0.1    2021-11-30 [?] CRAN (R 4.4.0)
 P usethis         2.2.3    2024-02-19 [?] CRAN (R 4.4.0)
 P utf8            1.2.4    2023-10-22 [?] CRAN (R 4.4.0)
 P vctrs           0.6.5    2023-12-01 [?] CRAN (R 4.4.0)
 P withr           3.0.0    2024-01-16 [?] CRAN (R 4.4.0)
 P xfun            0.44     2024-05-15 [?] CRAN (R 4.4.0)
 P xml2            1.3.6    2023-12-04 [?] CRAN (R 4.4.0)
 P xtable          1.8-4    2019-04-21 [?] CRAN (R 4.4.0)
 P yaml            2.3.8    2023-12-11 [?] CRAN (R 4.4.0)

 [1] /home/steffi/Projects/Business/Barbara Frei/urban_motus/renv/library/linux-ubuntu-jammy/R-4.4/x86_64-pc-linux-gnu
 [2] /home/steffi/.cache/R/renv/sandbox/linux-ubuntu-jammy/R-4.4/x86_64-pc-linux-gnu/9a444a72

 P ── Loaded and on-disk path mismatch.

──────────────────────────────────────────────────────────────────────────────