Monash University, Australia
ECSS Miniconference 2022
2022 Nov 17
A third-year PhD student at Monash University, Melbourne, Australia
My research centers on exploring multivariate spatio-temporal data with data wrangling and visualisation tools.
Find me on huizezhangsh, huizezhang-sherry, and https://huizezhangsh.netlify.app/
Inefficient memory use and repeated information, especially when large geometry objects are combined with frequent temporal data (daily or weekly).
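To make the repetition concrete, here is a minimal base R sketch. The toy data (`stations_toy`, `ts_toy`) and their values are hypothetical and are not the weather data used in these slides.

```r
# Hypothetical toy data: 3 stations with daily records for 2020
stations_toy <- data.frame(
  id   = c("A", "B", "C"),
  lat  = c(-31.4, -34.4, -28.1),
  long = c(153, 151, 140),
  name = c("station a", "station b", "station c")
)
dates <- seq(as.Date("2020-01-01"), as.Date("2020-12-31"), by = "day")
ts_toy <- data.frame(
  id   = rep(stations_toy$id, each = length(dates)),
  date = rep(dates, times = nrow(stations_toy)),
  tmax = 25  # placeholder measurement
)

# Joining into one flat table copies each station's spatial
# columns onto every one of its daily rows
flat <- merge(ts_toy, stations_toy, by = "id")

nrow(stations_toy)  # 3 rows of spatial information
nrow(flat)          # 1098 rows, i.e. 3 stations x 366 days
table(flat$name)    # each station name is stored 366 times
```

With a geometry column in place of `lat` and `long`, every repeated row would carry a full copy of the geometry.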
Cubble is a nested object built on tibble that allows easy pivoting between spatial and temporal forms.
spatial <- stations %>%
  {{ Your spatial analysis }}
##############################
# more subsetting step if temporal analysis
# depends on spatial results
sp_id <- spatial %>% pull(id)
ts_subset <- ts %>% filter(id %in% sp_id)
##############################
temporal <- ts_subset %>%
  {{ Your temporal analysis }}
##############################
# more subsetting step if spatial analysis
# depends on temporal results
ts_id <- temporal %>% pull(id)
sp_subset <- spatial %>% filter(id %in% ts_id)
##############################
sp_subset %>%
  {{ Your spatial analysis }}
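As one possible instance of the template above, here is a small dplyr sketch with the placeholders filled in. The toy tables and the two analysis steps (a latitude filter and a mean-temperature threshold) are hypothetical choices made only for illustration.

```r
library(dplyr)

# Hypothetical toy data
stations_toy <- tibble(
  id   = c("A", "B", "C"),
  lat  = c(-16.5, -32.1, -36.4),
  long = c(123, 117, 145)
)
ts_toy <- tibble(
  id   = rep(c("A", "B", "C"), each = 3),
  date = rep(as.Date("2020-01-01") + 0:2, times = 3),
  tmax = c(36.7, 34.2, 35.0, 30.1, 29.5, 31.0, 22.3, 21.8, 23.1)
)

# Spatial step: keep stations south of 30 degrees S
spatial <- stations_toy %>% filter(lat < -30)

# Manual bookkeeping: carry the surviving ids over to the temporal table
sp_id <- spatial %>% pull(id)
ts_subset <- ts_toy %>% filter(id %in% sp_id)

# Temporal step: keep stations whose mean tmax exceeds 25
temporal <- ts_subset %>%
  group_by(id) %>%
  summarise(avg_tmax = mean(tmax)) %>%
  filter(avg_tmax > 25)

# Manual bookkeeping again: carry the ids back to the spatial table
ts_id <- temporal %>% pull(id)
sp_subset <- spatial %>% filter(id %in% ts_id)
sp_subset
```

Each hand-off between the two tables needs its own `pull()` and `filter()` pair, and that is the bookkeeping the two-table workflow requires.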
# A tibble: 30 × 6
id lat long elev name wmo_id
<chr> <dbl> <dbl> <dbl> <chr> <dbl>
1 ASN00060139 -31.4 153. 4.2 port macquarie airport aws 94786
2 ASN00068228 -34.4 151. 10 bellambi aws 94749
3 ASN00017123 -28.1 140. 37.8 moomba airport 95481
4 ASN00081049 -36.4 145. 114 tatura inst sustainable ag 95836
5 ASN00018201 -32.5 138. 14 port augusta aero 95666
# … with 25 more rows
(weather <- as_cubble(
list(spatial = stations, temporal = ts),
key = id, index = date, coords = c(long, lat)
))
# cubble: id [30]: nested form
# bbox: [114.09, -41.88, 152.87, -11.65]
# temporal: date [date], prcp [dbl], tmax [dbl], tmin [dbl]
id lat long elev name wmo_id ts
<chr> <dbl> <dbl> <dbl> <chr> <dbl> <list>
1 ASN00003057 -16.5 123. 7 cygnet bay 94201 <tibble [316 × 4]>
2 ASN00005007 -22.2 114. 5 learmonth airport 94302 <tibble [363 × 4]>
3 ASN00005084 -21.5 115. 5 thevenard island 94303 <tibble [366 × 4]>
4 ASN00010515 -32.1 117. 199 beverley 95615 <tibble [354 × 4]>
5 ASN00012314 -27.8 121. 497 leinster aero 95448 <tibble [366 × 4]>
# … with 25 more rows
Spatial data (stations) can be an sf object and temporal data (ts) can be a tsibble object.
long form
# cubble: date, id [30]: long form
# bbox: [114.09, -41.88, 152.87, -11.65]
# spatial: lat [dbl], long [dbl], elev [dbl],
# name [chr], wmo_id [dbl]
id date prcp tmax tmin
<chr> <date> <dbl> <dbl> <dbl>
1 ASN00003057 2020-01-01 0 36.7 26.9
2 ASN00003057 2020-01-02 41 34.2 24
3 ASN00003057 2020-01-03 0 35 25.4
4 ASN00003057 2020-01-04 40 29.1 25.4
5 ASN00003057 2020-01-05 1640 27.3 24.3
# … with 10,627 more rows
back to the nested form:
# cubble: id [30]: nested form
# bbox: [114.09, -41.88, 152.87, -11.65]
# temporal: date [date], prcp [dbl], tmax [dbl],
# tmin [dbl]
id lat long elev name wmo_id ts
<chr> <dbl> <dbl> <dbl> <chr> <dbl> <list>
1 ASN0000… -16.5 123. 7 cygn… 94201 <tibble>
2 ASN0000… -22.2 114. 5 lear… 94302 <tibble>
3 ASN0000… -21.5 115. 5 thev… 94303 <tibble>
4 ASN0001… -32.1 117. 199 beve… 95615 <tibble>
5 ASN0001… -27.8 121. 497 lein… 95448 <tibble>
# … with 25 more rows
[1] TRUE
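The pivot between the two forms can be mimicked with plain tidyr as a way to see what each form holds. This is only an analogy using `nest()` and `unnest()` on hypothetical toy data, not how cubble implements the pivot; all object names are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical nested table: one row per station, time series in a list column
nested <- tibble(
  id   = c("A", "B"),
  lat  = c(-16.5, -22.2),
  long = c(123, 114),
  ts   = list(
    tibble(date = as.Date("2020-01-01") + 0:1, tmax = c(36.7, 34.2)),
    tibble(date = as.Date("2020-01-01") + 0:1, tmax = c(33.0, 31.5))
  )
)

# Long form: one row per station and date; spatial columns are set aside
spatial_only <- nested %>% select(-ts)
long_form <- nested %>% select(id, ts) %>% unnest(ts)

# Back to the nested form: re-nest the series and re-attach the spatial columns
back <- long_form %>%
  nest(ts = c(date, tmax)) %>%
  left_join(spatial_only, by = "id") %>%
  select(id, lat, long, ts)
```

In this analogy the spatial columns have to be stored and re-joined by hand, whereas the cubble headers shown above list the columns of the other form alongside the data.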
Reference temporal variables with $
# cubble: id [30]: nested form
# bbox: [114.09, -41.88, 152.87, -11.65]
# temporal: date [date], prcp [dbl], tmax [dbl], tmin [dbl]
id lat long elev name wmo_id ts avg_tmax
<chr> <dbl> <dbl> <dbl> <chr> <dbl> <list> <dbl>
1 ASN00003057 -16.5 123. 7 cygnet bay 94201 <tibble [316 × 4]> 32.4
2 ASN00005007 -22.2 114. 5 learmonth airport 94302 <tibble [363 × 4]> 33.2
3 ASN00005084 -21.5 115. 5 thevenard island 94303 <tibble [366 × 4]> 30.7
4 ASN00010515 -32.1 117. 199 beverley 95615 <tibble [354 × 4]> 26.4
5 ASN00012314 -27.8 121. 497 leinster aero 95448 <tibble [366 × 4]> 29.6
# … with 25 more rows
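A per-station summary such as `avg_tmax` can be sketched with a row-wise tibble in plain dplyr, where a list column can be indexed with `$` inside `mutate()`. This assumes the nested form is evaluated one row at a time, which the output above suggests but which is stated here as an assumption; the toy data and object names are hypothetical.

```r
library(dplyr)

# Hypothetical nested table with a list column of time series
nested <- tibble(
  id = c("A", "B"),
  ts = list(
    tibble(date = as.Date("2020-01-01") + 0:1, tmax = c(36.7, 34.2)),
    tibble(date = as.Date("2020-01-01") + 0:1, tmax = c(33.0, NA))
  )
)

# In a row-wise tibble, `ts` refers to the single tibble in the current row,
# so ts$tmax is that station's temperature series
with_avg <- nested %>%
  rowwise() %>%
  mutate(avg_tmax = mean(ts$tmax, na.rm = TRUE)) %>%
  ungroup()
with_avg
```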
Move spatial variables into the long form
# cubble: date, id [30]: long form
# bbox: [114.09, -41.88, 152.87, -11.65]
# spatial: lat [dbl], long [dbl], elev [dbl], name [chr], wmo_id [dbl]
id date prcp tmax tmin long lat
<chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl>
1 ASN00003057 2020-01-01 0 36.7 26.9 123. -16.5
2 ASN00003057 2020-01-02 41 34.2 24 123. -16.5
3 ASN00003057 2020-01-03 0 35 25.4 123. -16.5
4 ASN00003057 2020-01-04 40 29.1 25.4 123. -16.5
5 ASN00003057 2020-01-05 1640 27.3 24.3 123. -16.5
# … with 10,627 more rows
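Conceptually, moving spatial variables into the long form resembles joining selected station columns onto the temporal rows by the key. The sketch below shows that idea with a plain `left_join()`; it is an analogy on hypothetical toy data, not the package's implementation.

```r
library(dplyr)

# Hypothetical spatial and long-form temporal tables sharing the key `id`
spatial_toy <- tibble(
  id   = c("A", "B"),
  lat  = c(-16.5, -22.2),
  long = c(123, 114),
  elev = c(7, 5)
)
long_toy <- tibble(
  id   = rep(c("A", "B"), each = 2),
  date = rep(as.Date("2020-01-01") + 0:1, times = 2),
  tmax = c(36.7, 34.2, 33.0, 31.5)
)

# Bring only the requested spatial columns (long, lat) into the long form
unfolded <- long_toy %>%
  left_join(select(spatial_toy, id, long, lat), by = "id")
unfolded
```

Only the columns that are asked for get repeated across dates, so the duplication stays limited to what the next step needs.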
library(cubble)
library(dplyr)
library(ggplot2)

cb <- as_cubble(
  list(spatial = stations, temporal = ts),
  key = id, index = date, coords = c(long, lat)
)

set.seed(0927)
# Sample 20 stations, then average tmax by month
cb_glyph <- cb %>%
  slice_sample(n = 20) %>%
  face_temporal() %>%
  mutate(month = lubridate::month(date)) %>%
  group_by(month) %>%
  summarise(tmax = mean(tmax, na.rm = TRUE)) %>%
  unfold(long, lat)

# oz_simp (the base map passed to geom_sf) is created elsewhere in the slides
ggplot() +
  geom_sf(data = oz_simp,
          fill = "grey95",
          color = "white") +
  geom_glyph(
    data = cb_glyph,
    aes(x_major = long, x_minor = month,
        y_major = lat, y_minor = tmax),
    width = 2, height = 0.7) +
  ggthemes::theme_map()
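The glyph layer takes a major position (long, lat), two minor variables (month, tmax), and a width and height. One common way to place such glyphs, following the linear transformation described in Wickham et al. (2012), is to rescale each minor variable to [-1, 1] and use it as an offset around the major position. The base R sketch below illustrates that idea on hypothetical data; the helper `rescale11()` is defined here for illustration, and whether `geom_glyph()` computes positions in exactly this way is an assumption.

```r
# Illustrative helper: rescale a numeric vector linearly onto [-1, 1]
rescale11 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  2 * (x - rng[1]) / (rng[2] - rng[1]) - 1
}

# Hypothetical monthly summary for two stations
glyph_toy <- data.frame(
  long  = rep(c(123, 145), each = 3),
  lat   = rep(c(-16.5, -36.4), each = 3),
  month = rep(1:3, times = 2),
  tmax  = c(34, 35, 36, 26, 24, 20)
)

width  <- 2
height <- 0.7

# Minor variables are rescaled across all stations, then used as
# offsets around each station's major position
glyph_toy$gx <- glyph_toy$long + rescale11(glyph_toy$month) * width / 2
glyph_toy$gy <- glyph_toy$lat + rescale11(glyph_toy$tmax) * height / 2
glyph_toy
```

Because the rescaling here is global, glyph heights are comparable across stations; rescaling within each station instead would emphasise the shape of each series over its level.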
The slides are made with Quarto
All the materials used to prepare the slides are available at sherryzhang-ecssmini2022.netlify.app
Wickham, H., Hofmann, H., Wickham, C., & Cook, D. (2012). Glyph‐maps for visually exploring temporal patterns in climate data and models. Environmetrics, 23(5), 382-393: https://vita.had.co.nz/papers/glyph-maps.pdf