TL;DR
A collection of features that work together to reduce compile times.
Goal(s)
We want to reduce the time it takes to compile models. Some parts are out of our control (e.g. Dynamo graph capture), but we can reduce the time spent on the TensorRT side.
Tasks
- [ ] https://github.com/pytorch/TensorRT/issues/2674
- [ ] Weight Refit
- [ ] Engine Caching
Additional context
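To make the Engine Caching task more concrete, here is a minimal sketch of the general idea, not the Torch-TensorRT implementation. Every name in it (`EngineCache`, `make_key`, `get_or_build`, `build_fn`) is hypothetical. It assumes the cache key is derived from the graph structure and build settings only, not the weight values, so that a cached engine could in principle be reused and updated through Weight Refit instead of being rebuilt.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable, Tuple


class EngineCache:
    """Hypothetical on-disk cache mapping a graph/settings key to serialized engine bytes."""

    def __init__(self, cache_dir: str) -> None:
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    @staticmethod
    def make_key(graph_repr: str, settings: dict) -> str:
        # Assumption: the key covers graph structure and build settings only.
        # Weight values are deliberately excluded from the hash.
        payload = json.dumps(
            {"graph": graph_repr, "settings": settings}, sort_keys=True
        ).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def get_or_build(
        self, key: str, build_fn: Callable[[], bytes]
    ) -> Tuple[bytes, bool]:
        """Return (engine_bytes, cache_hit). build_fn is called only on a miss."""
        path = self.cache_dir / f"{key}.engine"
        if path.exists():
            return path.read_bytes(), True
        engine_bytes = build_fn()  # stands in for the expensive engine build
        path.write_bytes(engine_bytes)
        return engine_bytes, False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache = EngineCache(tmp)
        key = EngineCache.make_key("graph(x): relu(conv(x))", {"precision": "fp16"})
        # Placeholder build function; a real one would build and serialize an engine.
        _, first_hit = cache.get_or_build(key, lambda: b"serialized-engine")
        _, second_hit = cache.get_or_build(key, lambda: b"serialized-engine")
        print(first_hit, second_hit)
```

In this sketch the second lookup skips `build_fn` entirely, which is where the compile-time saving would come from. Changing any build setting produces a different key and forces a rebuild.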