Description
In the 'Reasons why packages are archived on CRAN' blog post on 2022-05-10, @llrs shows how to get metadata on different CRAN package events, including the archiving and unarchiving of packages, directly from CRAN. Specifically, this data is available in https://cran.r-project.org/src/contrib/PACKAGES.in.
One of the results of this 2022 study was:
"This suggests that once a package is archived maintainers do not make the effort to put it back on CRAN except on very few cases were there are multiple attempts. To check we can see the current available packages and see how many of those are still present on CRAN:
| CRAN | Packages | Proportion |
|---|---|---|
| no | 3869 | 64% |
| yes | 2183 | 36% |
Many packages are currently on CRAN despite their past archivation but close to 64% are currently not on CRAN." Conversely, this means that 36% of the archived packages had returned to CRAN.
In a Bioconductor Slack thread on 2024-03-05 (https://community-bioc.slack.com/archives/CLF37V6C8/p1709643793615939?thread_ts=1709600683.154139&cid=CLF37V6C8), @llrs added:
"Yes, 36% of all packages archived returned to CRAN (when I created the post). As time goes this % will lower, and also it could mean that a package was archived, then returned and then was archived for good. The time they were archived could be calculated comparing the archive and current dates and the date when they were archived. This is relatively trivial to do and could provide some estimation for CRANhaven."
It would be interesting to get the raw data for how long "returning" packages are archived. This information should be possible to retrieve from https://cran.r-project.org/src/contrib/PACKAGES.in because its entries carry information on the type of event and when it took place. Two examples are:
Package: jlmerclusterperm
X-CRAN-History: Archived on 2024-02-29 for policy violation.
.
Does not clean up use of cache.
Unarchived on 2024-03-04.
and
Package: BFS
X-CRAN-History: Archived on 2022-06-14 as check problems were not corrected in time.
Unarchived on 2022-09-07.
Archived on 2024-01-24 as requires archived package 'pxweb'.
Unarchived on 2024-02-02.
Archived on 2024-02-17 for policy violation.
.
On Internet access (429 error).
Unarchived on 2024-02-24.
With this raw data, we can estimate the distribution of how long packages stay off CRAN before returning.
We could also annotate each archived package with the reason it was archived. For instance, CRANhaven could also serve as a dashboard that gives an overview of why packages are no longer available, as an alternative to visiting each package's CRAN page.
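As a rough illustration of the calculation, here is a minimal Python sketch that pairs each "Archived on" event with the next "Unarchived on" event and counts the days in between. It assumes the `X-CRAN-History` text of a single package has already been extracted from PACKAGES.in; the function name `archived_spells` is hypothetical, and any event wording other than the two forms shown in the examples above is simply ignored.

```python
import re
from datetime import date

# Matches the two event forms seen in the examples; the field name is optional.
EVENT = re.compile(
    r"^(?:X-CRAN-History:\s*)?(Archived|Unarchived) on (\d{4}-\d{2}-\d{2})"
)

def archived_spells(history):
    """Return (archived_on, unarchived_on, days) for each completed spell.

    `history` is the X-CRAN-History text of one package (hypothetical input
    format: one event or continuation line per line).
    """
    spells = []
    archived_on = None
    for raw in history.splitlines():
        m = EVENT.match(raw.strip())
        if m is None:
            continue  # continuation line such as "." or a free-text detail
        kind, when = m.group(1), date.fromisoformat(m.group(2))
        if kind == "Archived":
            archived_on = when
        elif archived_on is not None:
            spells.append((archived_on, when, (when - archived_on).days))
            archived_on = None
    return spells

# The BFS entry quoted above.
BFS = """X-CRAN-History: Archived on 2022-06-14 as check problems were not corrected in time.
Unarchived on 2022-09-07.
Archived on 2024-01-24 as requires archived package 'pxweb'.
Unarchived on 2024-02-02.
Archived on 2024-02-17 for policy violation.
.
On Internet access (429 error).
Unarchived on 2024-02-24."""

days_off_cran = [days for _, _, days in archived_spells(BFS)]
print(days_off_cran)  # [85, 9, 7]
```

A package that is archived and has not yet returned produces no spell for its last event, so packages that are archived for good do not skew the distribution of return times.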
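One possible way to collect the annotation is sketched below in Python. It assumes, as in the two examples above, that the reason follows the date on the "Archived on" line and that any further detail appears on the continuation lines before the next event; the function name `archive_reasons` and the dictionary keys are hypothetical.

```python
import re

# "Archived on <date> <reason>"; the field name prefix is optional.
ARCHIVED = re.compile(
    r"^(?:X-CRAN-History:\s*)?Archived on (\d{4}-\d{2}-\d{2})\s*(.*)$"
)

def archive_reasons(history):
    """Return one dict per archiving event with its date, reason and details."""
    events = []
    current = None
    for raw in history.splitlines():
        line = raw.strip()
        m = ARCHIVED.match(line)
        if m:
            current = {
                "date": m.group(1),
                "reason": m.group(2).rstrip("."),
                "details": [],
            }
            events.append(current)
        elif line.startswith("Unarchived on "):
            current = None  # later continuation lines belong to no archiving
        elif current is not None and line not in ("", "."):
            current["details"].append(line)  # free-text explanation
    return events

# The jlmerclusterperm entry quoted above.
JLMER = """X-CRAN-History: Archived on 2024-02-29 for policy violation.
.
Does not clean up use of cache.
Unarchived on 2024-03-04."""

for event in archive_reasons(JLMER):
    print(event["date"], event["reason"], event["details"])
```

Counting the `reason` strings across all packages would give the kind of overview a dashboard could display, though the free-text wording would likely need some normalization first.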