One option is to retrieve the text from the relevant HTML elements. Sadly, this website's layout is not exactly simple:

- The days have to be treated differently, because the website structures the dates in an atrocious way. E.g. there are cases like Tue 22, where two cruises depart on the same day, which has to be handled. Also, not every day has its month within the same div; instead, the month sits a few divs above, on the SAME level.
- Another difficulty is that the cruise lines are not given as text but as images, so they have to be retrieved via the `alt` attribute.
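
Both points can be tried offline on a small hand-written fragment. This is only a sketch: the HTML below is hypothetical (it borrows the `psovde-*` class names but is not copied from the site), and `toy`, `toy_dates` and `toy_lines` are illustrative names.

```r
library(rvest)
library(dplyr)
library(tidyr)

# Hypothetical fragment: month and day divs are siblings on the same level,
# a blank day cell marks a second cruise on the same day, and the cruise
# line is only available as the alt text of a logo image.
toy <- minimal_html('
  <div class="psovde-month">April</div>
  <div class="psovde-day">Wed 16</div>
  <div class="psovde-cruiseline"><img src="a.png" alt="Line A logo"></div>
  <div class="psovde-day">Thu 17</div>
  <div class="psovde-cruiseline"><img src="b.png" alt="Line B logo"></div>
  <div class="psovde-day">&nbsp;</div>
  <div class="psovde-cruiseline"><img src="a.png" alt="Line A logo"></div>
  <div class="psovde-month">May</div>
  <div class="psovde-day">Sat 3</div>
  <div class="psovde-cruiseline"><img src="a.png" alt="Line A logo"></div>
')

# Month and day labels in document order, stripped of non-alphanumerics
labels <- toy %>%
  html_elements("div[class^='psovde-day'], div[class^='psovde-month']") %>%
  html_text2() %>%
  gsub("[^[:alnum:]]", "", .)
labels[labels == ""] <- NA  # blank day cell -> NA, filled from the row above

toy_dates <- tibble(day = labels) %>%
  fill(day, .direction = "down") %>%
  # rows consisting only of letters are month headers
  mutate(month = ifelse(grepl("^[A-Za-z]+$", day), day, NA)) %>%
  fill(month, .direction = "down") %>%
  filter(day != month)       # drop the month header rows themselves

# Cruise line names come from the alt attribute of the logo images
toy_lines <- toy %>%
  html_elements("div.psovde-cruiseline img") %>%
  html_attr("alt") %>%
  gsub(" logo", "", .)

toy_result <- data.frame(toy_dates, cruise_line = toy_lines)
print(toy_result)
```

The month rows are used only to fill the `month` column and are then filtered out, so each remaining row lines up with exactly one cruise.
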
```r
library(rvest)
library(tidyverse)

page <- read_html('https://www.cruisetimetables.com/invergordon-scotland-cruise-ship-schedule-2025.html')

# Get day and month labels in document order, stripped of non-alphanumerics
day <- page %>%
  html_elements("div[class^='psovde-day'], div[class^='psovde-month']") %>%
  html_text2() %>%
  gsub("[^[:alnum:]]", "", .)

day <- day[-1]          # drop the first match
day[day == ""] <- NA    # empty day cells become NA so they can be filled

df_dates <- as.data.frame(day) %>%
  # a second cruise on the same day inherits the day from the row above
  fill(day, .direction = "down") %>%
  # rows consisting only of letters are month headers
  mutate(month = ifelse(grepl("^[A-Za-z]+$", day), day, NA)) %>%
  fill(month, .direction = "down") %>%
  # drop the month header rows themselves
  filter(day != month)

# Cruise lines are images, so read the alt attribute and strip " logo"
cruise_line <- page %>%
  html_elements("div.psovde-cruiseline img") %>%
  html_attr("alt") %>%
  gsub(" logo", "", .)

# For the remaining columns, drop entries containing a carriage return
ship <- page %>%
  html_elements("div[class^='psovde-ship']") %>%
  html_text2() %>%
  .[!grepl("\r", .)]

times <- page %>%
  html_elements("div[class^='psovde-times']") %>%
  html_text2() %>%
  .[!grepl("\r", .)]

passengers <- page %>%
  html_elements("div[class^='psovde-passengers']") %>%
  html_text2() %>%
  .[!grepl("\r", .)]

# All vectors must have the same length, one entry per cruise
finalData <- data.frame(
  day = df_dates$day,
  month = df_dates$month,
  cruise_line = cruise_line,
  ship = ship,
  times = times,
  passengers = passengers
)
```
giving
```
    day month    cruise_line          ship         times passengers
1 Wed16 April           AIDA       AIDAsol a 1000 d 2000       2174
2 Thu17 April CFC Croisieres   Renaissance a 0700 d 1800       1358
3 Wed23 April           AIDA       AIDAsol a 1000 d 1900       2174
4 Tue29 April Phoenix Reisen         Amera a 0800 d 2000        834
5  Sat3   May           AIDA      AIDAluna a 0800 d 1800       2050
6  Tue6   May    TUI Cruises Mein Schiff 3 a 0730 d 1900       2506
```
You can then write the result to a CSV file with `write.csv(finalData, "cruises.csv")`.
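
If separate columns are more convenient, the combined `day` and `times` values can be split afterwards. The following is a minimal base-R sketch on two rows taken from the output above. It assumes that "a" and "d" stand for arrival and departure (a guess, not confirmed by the page) and that every `times` value has the form `a HHMM d HHMM`; the names `sample_rows`, `weekday`, `day_of_month`, `arrival` and `departure` are illustrative.

```r
# Two rows copied from the result above, used as sample input
sample_rows <- data.frame(
  day   = c("Wed16", "Sat3"),
  times = c("a 1000 d 2000", "a 0800 d 1800"),
  stringsAsFactors = FALSE
)

# Split "Wed16" into the weekday and the day of the month
sample_rows$weekday      <- sub("[0-9]+$", "", sample_rows$day)
sample_rows$day_of_month <- as.integer(sub("^[A-Za-z]+", "", sample_rows$day))

# Split "a 1000 d 2000" into the two four-digit times
# (values that do not match the pattern are left unchanged by sub())
sample_rows$arrival   <- sub("^a ([0-9]{4}) d [0-9]{4}$", "\\1", sample_rows$times)
sample_rows$departure <- sub("^a [0-9]{4} d ([0-9]{4})$", "\\1", sample_rows$times)

print(sample_rows)
```

The times are kept as strings so that leading zeros such as `0800` are preserved.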