Hello, what a great project! Unfortunately, the vLLM fix for the Pascal architecture no longer works on the main branch.
vLLM changed the way it checks for compute capability, but I was unable to find how it's done in the current version.
Would you be able to refactor the Pascal patch to make TinyLLM work with recent versions of vLLM? Much obliged!