* add scripts for gemm shared memory interference
* split up shared_mem into llm and gemm subrepo
* intra_sm/shared_mem/gemm v1
* membw to use main.py instead of gcontext_test.py
* fix bugs related to num_requests and num_threads_per_tb
* inter_sm/mem_bw verified
* remove gcontext_test.py; new universal entrypoint is main.py
* fix inter_sm/l2 to use L2Kernel
* fix missing set_percentage arg
* fix num_warmup vs num_request
* minor adjustment to README ipc
socc-25 release