tm_player_transfer_history() failing due to not being available in the HTML of transfermarkt #342

Closed
JaseZiv opened this issue Nov 12, 2023 · 3 comments · Fixed by #343

@JaseZiv
Owner

JaseZiv commented Nov 12, 2023

Without using some form of browser automation, player transfer histories are no longer able to be scraped by tm_player_transfer_history() in its current form.

Will open this issue and try to incorporate the work @tonyelhabr did using chromote to obtain certain FBREF data points.

@JaseZiv added the bug label (Something isn't working) on Nov 12, 2023
@tonyelhabr
Collaborator

> Without using some form of browser automation, player transfer histories are no longer able to be scraped by tm_player_transfer_history() in its current form.
>
> Will open this issue and try to incorporate the work @tonyelhabr did using chromote to obtain certain FBREF data points.

I tried out the chromote approach and found that I'm getting blocked upon loading a player URL.

session <- worldfootballR:::worldfootballr_chromote_session("https://www.transfermarkt.com/cristiano-ronaldo/profil/spieler/8198")
session$session$view()


I did find that there is an API call that we can make to get some of the transfer history elements, although I'm not sure how we'll get some things like from_country and to_country.

library(worldfootballR)
library(httr)
#> Warning: package 'httr' was built under R version 4.2.3
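# Send the user agent stored in the worldfootballR.agent option
headers <- c(
  `User-Agent` = getOption("worldfootballR.agent")
)

# Query the transfer-history endpoint for player ID 8198 (Cristiano Ronaldo)
res <- httr::GET(
  url = "https://www.transfermarkt.com/ceapi/transferHistory/list/8198",
  httr::add_headers(.headers = headers)
)

# Parse the response body and inspect the first two transfer records
cont <- httr::content(res)
transfers <- cont$transfers
str(transfers[1:2], max.level = 2)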
#> List of 2
#>  $ :List of 12
#>   ..$ url               : chr "/cristiano-ronaldo/transfers/spieler/8198/transfer_id/4197140"
#>   ..$ from              :List of 7
#>   ..$ to                :List of 7
#>   ..$ futureTransfer    : int 0
#>   ..$ date              : chr "Jan 1, 2023"
#>   ..$ dateUnformatted   : chr "2023-01-01"
#>   ..$ upcoming          : logi FALSE
#>   ..$ season            : chr "22/23"
#>   ..$ marketValue       : chr "€20.00m"
#>   ..$ fee               : chr "-"
#>   ..$ showUpcomingHeader: logi FALSE
#>   ..$ showResetHeader   : logi FALSE
#>  $ :List of 12
#>   ..$ url               : chr "/cristiano-ronaldo/transfers/spieler/8198/transfer_id/4152208"
#>   ..$ from              :List of 7
#>   ..$ to                :List of 7
#>   ..$ futureTransfer    : int 0
#>   ..$ date              : chr "Nov 22, 2022"
#>   ..$ dateUnformatted   : chr "2022-11-22"
#>   ..$ upcoming          : logi FALSE
#>   ..$ season            : chr "22/23"
#>   ..$ marketValue       : chr "€20.00m"
#>   ..$ fee               : chr "-"
#>   ..$ showUpcomingHeader: logi FALSE
#>   ..$ showResetHeader   : logi FALSE
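
For reference, here is a minimal sketch of how the nested list above could be flattened into one row per transfer. It is only an illustration: `flatten_transfers()` and the `transfer_date` column are hypothetical names, not part of worldfootballR, and the mock records copy only the scalar fields shown in the `str()` output. The nested `from` and `to` elements are left out because their fields are not shown in this thread.

```r
# Hypothetical helper (not part of worldfootballR): turn the scalar,
# top-level fields of each transfer record into one data frame row.
flatten_transfers <- function(transfers) {
  if (length(transfers) == 0) {
    return(data.frame())
  }
  fields <- c("url", "dateUnformatted", "season", "marketValue", "fee")
  rows <- lapply(transfers, function(tr) {
    vals <- lapply(fields, function(f) {
      v <- tr[[f]]
      # Missing fields become NA so every row has the same columns
      if (is.null(v)) NA_character_ else as.character(v)
    })
    names(vals) <- fields
    as.data.frame(vals, stringsAsFactors = FALSE)
  })
  out <- do.call(rbind, rows)
  # dateUnformatted is already ISO 8601, so as.Date() parses it directly
  out$transfer_date <- as.Date(out$dateUnformatted)
  out$dateUnformatted <- NULL
  out
}

# Mock input mirroring the two records printed above (no network call)
transfers <- list(
  list(
    url = "/cristiano-ronaldo/transfers/spieler/8198/transfer_id/4197140",
    dateUnformatted = "2023-01-01",
    season = "22/23",
    marketValue = "€20.00m",
    fee = "-"
  ),
  list(
    url = "/cristiano-ronaldo/transfers/spieler/8198/transfer_id/4152208",
    dateUnformatted = "2022-11-22",
    season = "22/23",
    marketValue = "€20.00m",
    fee = "-"
  )
)

history <- flatten_transfers(transfers)
print(history)
```

Against the live response, the same helper would presumably be called as `flatten_transfers(cont$transfers)`. The monetary strings are kept as characters here, since the thread only shows the values "€20.00m" and "-" and other formats may exist.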

@tonyelhabr self-assigned this on Nov 22, 2023
@tonyelhabr
Collaborator

After a GitHub search, I found that a Python package made a similar fix in the past two weeks. Here is their code for scraping history.

@tonyelhabr
Collaborator

Oh, so I think we can still get the "extra info" from the server-side loaded data. So we may actually be able to return the same data from the function as before.
