
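"LongLLaMA: true to its name, it is all about being loooooong... built specifically for long contexts." It supports context lengths of up to 256k tokens. The model is based on LLaMA and fine-tuned with the Focused Transformer (FoT) method, which lets it concentrate attention on the relevant parts of a long text instead of spreading it evenly. Links (HuggingFace, Colab, GitHub):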

https://huggingface.co/syzymon/long_llama_code_7b

https://colab.research.google.com/github/CStanKonrad/long_llama/blob/main/long_llama_colab.ipynb

https://github.com/CStanKonrad/long_llama

https://weibo.com/u/1627825392
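
To make "concentrating attention instead of spreading it evenly" concrete, here is a toy sketch in plain Python. It is not LongLLaMA's or FoT's actual implementation: the function names, the `top_k` parameter, and the tiny key/value memory are all made up for illustration. It only contrasts a uniform average over a memory with attention restricted to the few keys that best match the query.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def focused_attention(query, keys, values, top_k):
    """Attend only to the top_k keys with the highest dot product with the query.

    Illustrative assumption: a simple top-k selection stands in for
    whatever retrieval the real model uses.
    """
    scores = [dot(query, k) for k in keys]
    top = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:top_k]
    weights = softmax([scores[i] for i in top])
    dim = len(values[0])
    return [sum(w * values[i][d] for w, i in zip(weights, top)) for d in range(dim)]


def uniform_average(values):
    """Baseline: every memory entry gets the same weight."""
    n = len(values)
    dim = len(values[0])
    return [sum(v[d] for v in values) / n for d in range(dim)]


# Hypothetical toy memory: one relevant entry among three distractors.
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
values = [[10.0], [0.0], [0.0], [0.0]]
query = [5.0, 0.0]

focused = focused_attention(query, keys, values, top_k=2)
uniform = uniform_average(values)
print("focused:", focused)
print("uniform:", uniform)
```

With this toy data the focused result stays close to the relevant value, while the uniform average is diluted by the distractors. For the real model, see the Colab notebook and the GitHub repository linked above.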