Each observation in the training set is a fixed-length slice of a raw wave file. For instance, slicing a 55-second wave file at 1-second intervals with no overlap yields 55 disjoint 1-second slices.
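The arithmetic behind that example can be sketched in a few lines of base R. This is only an illustration: `slice_bounds()` and its arguments `duration`, `interval` and `overlap` are hypothetical names, not part of mestrado, and the sketch makes no claim about how `slice_wavs()` is implemented.

```r
# Hypothetical helper (not part of mestrado): start/end times, in seconds,
# of fixed-length slices taken from a recording of a given duration.
slice_bounds <- function(duration, interval = 1, overlap = 0) {
  stopifnot(interval > 0, overlap >= 0, overlap < interval, duration >= interval)
  step <- interval - overlap                        # distance between slice starts
  starts <- seq(0, duration - interval, by = step)  # last slice must fit entirely
  data.frame(start = starts, end = starts + interval)
}

bounds <- slice_bounds(duration = 55, interval = 1, overlap = 0)
nrow(bounds)
#> [1] 55
```

With a non-zero `overlap`, consecutive slices share `overlap` seconds, so the same recording yields more slices.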

library(mestrado)

wav_dir <- system.file("wav_sample", package = "mestrado")
temp_dir <- tempdir()

slices_path <- slice_wavs(wav_dir, temp_dir)
slices_path
#> [1] "/tmp/RtmpkLsiuB"

slices <- list.files(slices_path)
slices[4:7]
#> [1] "Megascops-atricapilla-1261496@0@0@.wav"
#> [2] "Megascops-atricapilla-1393458@0@0@.wav"
#> [3] "Megascops-choliba-118111@0@0@.wav"     
#> [4] "Megascops-choliba-1891062@0@0@.wav"

The resulting file names are designed to be “parser friendly”: they work well with tidyr::separate(sep = "@"). This metadata is useful when matching slices with annotations of the presence/absence of a bird song or any other event of interest.

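library(tidyverse)
# Split each file name on "@". The trailing ".wav" piece is dropped
# (extra = "drop"); files that are not slices contain no "@" and get
# NA for start and end (fill = "right").
slices_metadata <- tibble(
  file_name = slices
) %>%
  tidyr::separate(file_name, c("species", "start", "end"), sep = "@",
                  extra = "drop", fill = "right")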

slices_metadata %>% head()
#> # A tibble: 6 x 3
#>   species                       start end  
#>   <chr>                         <chr> <chr>
#> 1 file34a2723b4d29.so           <NA>  <NA> 
#> 2 file34a27391be01.txt          <NA>  <NA> 
#> 3 Glaucidium-minutissimum-24426 0     0    
#> 4 Megascops-atricapilla-1261496 0     0    
#> 5 Megascops-atricapilla-1393458 0     0    
#> 6 Megascops-choliba-118111      0     0
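As a minimal sketch of the matching step mentioned above, the parsed file names can be joined to an annotation table. The `annotations` tibble, its `has_song` column and the `recording` column name are assumptions made up for illustration; they are not produced by mestrado. The sketch also keeps only `.wav` files, so stray temporary files do not show up as rows of NA.

```r
library(dplyr)
library(tidyr)

# Two file names copied from the listing above, plus one stray temporary file.
slices <- c(
  "file34a27391be01.txt",
  "Megascops-atricapilla-1261496@0@0@.wav",
  "Megascops-choliba-118111@0@0@.wav"
)

# Keep only the slice files before parsing.
wav_slices <- slices[grepl("\\.wav$", slices)]

slices_metadata <- tibble(file_name = wav_slices) %>%
  separate(
    file_name,
    into = c("recording", "start", "end"),
    sep = "@",
    extra = "drop",   # discard the trailing ".wav" piece
    remove = FALSE,   # keep the original file name alongside the parsed parts
    convert = TRUE    # turn start and end into integers
  )

# Hypothetical annotations: one row per labelled slice.
annotations <- tibble(
  recording = "Megascops-atricapilla-1261496",
  start = 0L,
  end = 0L,
  has_song = TRUE
)

# Slices without an annotation get NA in has_song.
labelled <- left_join(slices_metadata, annotations,
                      by = c("recording", "start", "end"))
```

Using `convert = TRUE` matters here: without it `start` and `end` stay character columns and the join against integer keys would fail.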