(Note: the load tests also run on Modal, and the code is available here: https://github.com/modal-labs/modal-examples/tree/main/06_gpu_and_ml/llm-serving/openai_compatible)