Skip to content

add template specializations to avoid DATAPTR use #1310

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Jun 22, 2024

Conversation

kevinushey
Copy link
Contributor

@kevinushey kevinushey commented Jun 11, 2024

Addresses #1308, but note that this PR would turn these out-of-bounds accesses into R errors, and so we might expect to see some CRAN failures. If that feels too aggressive, we could make these warnings instead, or control the behavior with an environment variable (read on package load), or something else.

This PR seeks to avoid some of the crashes seen in packages using Rcpp, where out-of-bounds accessed could trigger an ASAN misaligned read (due to special behavior in DATAPTR attempting to catch such usages).

This PR also now adds bounds checking to offset accessing via operator[] for the various Vector classes provided by Rcpp; controlled via the same RCPP_NO_BOUNDS_CHECK macro used by the subsetting code.

For example:

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
double get(NumericVector x, R_xlen_t index) {
    return x[index];
}

/*** R
get(42, 2)
*/

Would now give:

> get(42, 2)
Error in eval(ei, envir) : 
  subscript out of bounds (index 2 >= vector size 1)

Checklist

  • Code compiles correctly
  • R CMD check still passes all tests
  • Preferably, new tests were added which fail without the change
  • Document the changes by file in ChangeLog

@kevinushey kevinushey requested a review from eddelbuettel June 11, 2024 23:13
@eddelbuettel
Copy link
Member

Nice stuff. Would you mind if I spend the usual day or two and a half on rev.deps, "just in case" ?

@Enchufa2
Copy link
Member

Nice implementation, LGTM! Given that these now are going to be ASAN errors on CRAN anyways, I'd say it's fine if they trigger errors from our side.

@eddelbuettel
Copy link
Member

Given that these now are going to be ASAN errors on CRAN anyways, I'd say it's fine if they trigger errors from our side.

Very true but it may make updating at CRAN harder for us if we trigger N new things in M packages.

@eddelbuettel eddelbuettel force-pushed the bugfix/r-vector-start-specializations branch from 86cf6ba to 69cb4dd Compare June 12, 2024 13:27
@kevinushey
Copy link
Contributor Author

Nice stuff. Would you mind if I spend the usual day or two and a half on rev.deps, "just in case" ?

For sure -- we could also consider whether we should allow users (or CRAN) to configure whether out-of-bounds checks should be errors, warnings, or something else.

@eddelbuettel
Copy link
Member

eddelbuettel commented Jun 12, 2024

Three hours in (had some follow-up to do on RcppArmadillo run first)

Test of Rcpp 1.0.12.3.1.1 had 142 successes, 3 failures, and 1 skipped packages.

2 of the 3 are out of bounds (aum, BART) the third is small numerical diff on test so maybe not us (bayesianVars).

PS1: Twelve hours in:

Test of Rcpp 1.0.12.3.1.1 had 723 successes, 6 failures, and 6 skipped packages

PS2: Twentyfour hours in

Test of Rcpp 1.0.12.3.1.1 had 1514 successes, 16 failures, and 14 skipped packages.

@bbolker
Copy link
Contributor

bbolker commented Jun 12, 2024

Will look forward to seeing whether this resolves lme4/lme4#794 (still hoping that what we're seeing is not the revealing of previously cryptic out-of-bounds accesses, but rather things that are legal but trigger an ASAN warning anyway ...)

@eddelbuettel
Copy link
Member

The reverse depends run finished just as I was leaving for a two-day bike trip. The results are not terrible but also not something which makes me think we can commit this as is:

Test of Rcpp 1.0.12.3.1.1 had 2787 successes, 40 failures, and 25 skipped packages.
Ran from 2024-06-12 08:35:10.73 to 2024-06-14 06:50:59.41 for 1.928 days
Average of 58.397 secs relative to 311.843 secs using 7 runners

Failed packages:  aum, BART, bayesianVARs, birdie, chouca, cna, fixest, FLOPART, FLSSS, HistDAWass, kdevine, LOPART, magi, maxnodf, mined, mixvlmc, nanotime, netdiffuseR, nlmixr2est, o2geosocial, ondisc, onemap\
, openxlsx, pedmod, PKPDsim, Rborist, reservr, revdbayes, rsparse, rstpm2, rxode2, secr, supc, thunder, TreeDist, TSrepr, VAJointSurv, vapour, xrnet, ZIPBayes

Almost all the 'fails' are due to run-time and not install-time so it could be the 'harsher' out of bounds access. I will try to summarize later.

@eddelbuettel
Copy link
Member

eddelbuettel commented Jun 16, 2024

@kevinushey With the extra delay of me disappearing for 1 1/2 or 2 days 😀 I was now able to summarize the reverse dependency run. As logged, it is essentially 'out of bounds'. So we probably want to preemptively define the 'death by bounds check' away, or maybe tone it down to a message.

PS I played a little with a simple define to avoid the current stop() but it gets me into different trouble and segfaults.

@Enchufa2
Copy link
Member

Most of them already have ASAN/UBSAN errors on CRAN, and the others will likely have them when the CRAN machinery reaches them.

@eddelbuettel
Copy link
Member

eddelbuettel commented Jun 17, 2024

I am much less sanguine here as we are (by the standard release pattern) looking into a release in two week (early July) to four weeks (mid July). I think we need to address this, or possibly roll it to the next release with sufficient time to communicate to affected maintainers. (Also, package nanotime was in that set and as I prepared a release for it yesterday it was not very clear what/how to change for it. It used zero-length arrays as a feature yet some of its unit tests balked and while there may be a out of index write hiding there are also other issues present.)

@Enchufa2
Copy link
Member

Then, let's make it a warning for now and an error in the next release?

@eddelbuettel
Copy link
Member

I ran into errors when I mod'ed it to 'continue' rather than stop().

@kevinushey Any insights from your end?

@eddelbuettel
Copy link
Member

eddelbuettel commented Jun 18, 2024

Thanks for the update, @kevinushey -- I will give this a spin in another reverse-dependency run.

Another thing I noticed was that if one where to define the 'off-switch' the code doesn't actually compile cleanly. I tried this diff

diff --git a/inst/include/Rcpp/vector/Subsetter.h b/inst/include/Rcpp/vector/Subsetter.h
index 0d8b32a6..0905e1a3 100644
--- a/inst/include/Rcpp/vector/Subsetter.h
+++ b/inst/include/Rcpp/vector/Subsetter.h
@@ -147,7 +147,7 @@ private:
     }
     #else
     template <typename IDX>
-    void check_indices(IDX* x, IDX n, IDX size) {}
+    void check_indices(IDX* x, R_xlen_t n, R_xlen_t size) {}
     #endif

     void get_indices( traits::identity< traits::int2type<INTSXP> > t ) {

Can you take a peek?

@kevinushey
Copy link
Contributor Author

Should be good now. We now give a warning rather than a hard error; I indicated in NEWS that this might change in the future.

@eddelbuettel
Copy link
Member

We now give a warning rather than a hard error

In one spot. You left a stop() in the other. On purpose?

@kevinushey
Copy link
Contributor Author

I left https://github.com/RcppCore/Rcpp/pull/1310/files#diff-4d30eca18cf4d872269fc011605a047e2212875dcb25688bd2460986c632064bL144 alone because that was pre-existing code (we already gave a hard error in those cases) but I can switch that to a warning if we want to be uniform?

@eddelbuettel
Copy link
Member

Oh, I completely missed that we already stopped there / that that bit was not new code. My bad!

@eddelbuettel
Copy link
Member

So I ran another reverse dependency check, and the result summary is here. It has seven failures of which one is due to missing build dependencies and at least one is recurrent leaving five. One is possibly local / spurious (connection issue), leaving four. CRAN may let me get away with arguing that the DATAPTR improvement is worth it.

@eddelbuettel
Copy link
Member

We are down to five now. One is possibly a test setup issue 1. One is a package for which CRAN requested other changes. I will open an issue here with links and try to get in touch with the maintainers -- but first things first: I will merge this, and then make a new incremental release 1.0.12.4 with this merged and make it available via the drat (and r-universe) so that the affected maintainers can test with it.

Footnotes

  1. Sometimes tests fail on my end and not at CRAN, here we have a package not getting on with RcppArmadillo on the spot compilation.

Copy link
Member

@eddelbuettel eddelbuettel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is ready now. Big thank you for preparing it!

@eddelbuettel eddelbuettel merged commit 5614a8b into master Jun 22, 2024
16 checks passed
@eddelbuettel eddelbuettel deleted the bugfix/r-vector-start-specializations branch June 22, 2024 16:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants