部署你自己的LLaMA 3.2 API,兼容OpenAI模式,通过@静楠墨筠感谢@Berry Bubble和@折杨柳垂杨浮绿水,更新我们的LLaMA 3.1示例非常简单代码: https://github.com/modal-labs/modal-examples/blob/main/06_gpu_and_ml/llm-serving/vllm_inference.py