Conversation
```swift
public func loraLinearLayers() -> MLXLMCommon.LoRALinearLayers {
    // TODO ???
    return []
}
```
I wasn't sure what to do for this.
Normally the q and v projection layers from attention:

```swift
public func loraLinearLayers() -> LoRALinearLayers {
    model.layers.map { ($0.attention, ["q_proj", "v_proj"]) }
}
```

but this doesn't seem to have an Attention layer. It works with any linear layers, so perhaps the x_proj and dt_proj layers in MambaBlock?

Otherwise maybe just remove the method. Also, FWIW this type would need to conform to LoRAModel for this to work.
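If the MambaBlock route is taken, a possible sketch (hypothetical: it assumes the model type is named `MambaModel`, that `model.layers` is the array of `MambaBlock`, and that each block exposes `x_proj` and `dt_proj` as `Linear` children under those key names — none of that is confirmed by this PR):

```swift
// Hypothetical sketch -- not the PR's actual code.
// Targets the Mamba projection layers instead of attention projections.
// Assumes `model.layers` is [MambaBlock] and each block has `x_proj` and
// `dt_proj` Linear modules registered under those key names.
extension MambaModel: LoRAModel {
    public func loraLinearLayers() -> LoRALinearLayers {
        model.layers.map { ($0, ["x_proj", "dt_proj"]) }
    }
}
```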
```swift
import Foundation
import MLX
import MLXFast
```
This is included in MLX now -- you can remove this import.
```swift
// port of https://github.com/ml-explore/mlx-lm/blob/main/mlx_lm/models/mamba.py
```
```swift
struct StringKey: CodingKey, ExpressibleByStringLiteral {
```
You don't need this, see below.
And if you have to keep it, please make it private
```swift
try container
    .decodeIfPresent(Int.self, forKey: .hiddenSize)
    ?? fallback
        .decode(Int.self, forKey: "d_model")
```
You can do this with:

```swift
enum CodingKeys: String, CodingKey {
    case modelType = "model_type"
    case vocabSize = "vocab_size"
    case hiddenSize = "hidden_size"
    case dModel = "d_model"
}
```

then:

```swift
hiddenSize = try container.decodeIfPresent(Int.self, forKey: .hiddenSize)
    ?? container.decode(Int.self, forKey: .dModel)
```

#316 will also provide a good solution to this, but it isn't merged yet.
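To illustrate the two-key fallback pattern in isolation, here is a small self-contained example using only Foundation (a standalone `Config` type invented for demonstration, not the PR's actual configuration struct):

```swift
import Foundation

// Minimal standalone example of the two-key fallback: the same field can be
// decoded from either "hidden_size" or the legacy "d_model" key.
struct Config: Decodable {
    let hiddenSize: Int

    enum CodingKeys: String, CodingKey {
        case hiddenSize = "hidden_size"
        case dModel = "d_model"
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        // Prefer "hidden_size"; fall back to "d_model" if absent.
        hiddenSize = try container.decodeIfPresent(Int.self, forKey: .hiddenSize)
            ?? container.decode(Int.self, forKey: .dModel)
    }
}

let json = #"{"d_model": 768}"#.data(using: .utf8)!
let config = try JSONDecoder().decode(Config.self, from: json)
print(config.hiddenSize)  // prints 768
```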
I'll just wait for #316 and update the PR using the new macro.