nested gradient with hessian #1264

Open
@YichengDWu

Reverse on forward on reverse:

julia> using Zygote

julia> function f1(x, ps)  # [edit: renamed not to clash]
       hess = Zygote.hessian(x->sum(x.^3), x)
       return hess * x .+ ps.bias
       end
f1 (generic function with 1 method)

julia> x = rand(3)
3-element Vector{Float64}:
 0.9750274052932691
 0.015723824729741764
 0.9792305251283961

julia> ps = (;bias = rand(3))
(bias = [0.6184575461887033, 0.24789621977449272, 0.5451996227584986],)

julia> f1(x,ps)
3-element Vector{Float64}:
 6.322528192626253
 0.24937965175928256
 6.298554150817905

julia> Zygote.gradient(p -> sum(f1(x,p)), ps)
ERROR: Mutating arrays is not supported -- called setindex!(Matrix{Float64}, ...)
This error occurs when you ask Zygote to differentiate operations that change
the elements of arrays in place (e.g. setting values with x .= ...)

Possible fixes:
- avoid mutating operations (preferred)
- or read the documentation and solutions for this error
  https://fluxml.ai/Zygote.jl/dev/limitations.html#Array-mutation-1

Stacktrace:
  [1] error(s::String)
    @ Base .\error.jl:33
  [2] _throw_mutation_error(f::Function, args::Matrix{Float64})
    @ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\lib\array.jl:70
  [3] (::Zygote.var"#444#445"{Matrix{Float64}})(#unused#::Nothing)
    @ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\lib\array.jl:82
  [4] (::Zygote.var"#2496#back#446"{Zygote.var"#444#445"{Matrix{Float64}}})(Δ::Nothing)
    @ Zygote C:\Users\Luffy\.julia\packages\ZygoteRules\AIbCs\src\adjoint.jl:67
  [5] Pullback
    @ C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\lib\forward.jl:31 [inlined]
  [6] (::typeof(∂(forward_jacobian)))(Δ::Tuple{Nothing, Matrix{Float64}})
    @ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\compiler\interface2.jl:0
  [7] Pullback
    @ C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\lib\forward.jl:44 [inlined]
  [8] Pullback
    @ C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\lib\forward.jl:43 [inlined]
  [9] Pullback
    @ C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\lib\grad.jl:76 [inlined]
 [10] (::typeof(∂(hessian_dual)))(Δ::Matrix{Float64})
    @ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\compiler\interface2.jl:0
 [11] Pullback
    @ C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\lib\grad.jl:74 [inlined]
 [12] Pullback
    @ .\REPL[119]:2 [inlined]
 [13] (::typeof(∂(f)))(Δ::FillArrays.Fill{Float64, 1, Tuple{Base.OneTo{Int64}}})
    @ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\compiler\interface2.jl:0
 [14] Pullback
    @ .\REPL[123]:1 [inlined]
 [15] (::typeof(∂(#97)))(Δ::Float64)
    @ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\compiler\interface2.jl:0
 [16] (::Zygote.var"#60#61"{typeof(∂(#97))})(Δ::Float64)
    @ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\compiler\interface.jl:41
 [17] gradient(f::Function, args::NamedTuple{(:bias,), Tuple{Vector{Float64}}})
    @ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\compiler\interface.jl:76
 [18] top-level scope
    @ REPL[123]:1

Forward on forward on reverse:

julia> Zygote.forward_jacobian(p -> sum(f1(x,p)), ps)
ERROR: MethodError: no method matching size(::NamedTuple{(:bias,), Tuple{Vector{Float64}}})
Closest candidates are:
  size(::Union{LinearAlgebra.QR, LinearAlgebra.QRCompactWY, LinearAlgebra.QRPivoted}) at C:\Users\Luffy\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\LinearAlgebra\src\qr.jl:567
  size(::Union{LinearAlgebra.QR, LinearAlgebra.QRCompactWY, LinearAlgebra.QRPivoted}, ::Integer) at C:\Users\Luffy\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\LinearAlgebra\src\qr.jl:566
  size(::Union{LinearAlgebra.Cholesky, LinearAlgebra.CholeskyPivoted}) at C:\Users\Luffy\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\LinearAlgebra\src\cholesky.jl:494
  ...
Stacktrace:
 [1] seed(x::NamedTuple{(:bias,), Tuple{Vector{Float64}}}, ::Val{1}, offset::Int64) (repeats 2 times)
   @ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\lib\forward.jl:7
 [2] forward_jacobian(f::var"#99#100", x::NamedTuple{(:bias,), Tuple{Vector{Float64}}}, #unused#::Val{1})
   @ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\lib\forward.jl:29
 [3] forward_jacobian(f::Function, x::NamedTuple{(:bias,), Tuple{Vector{Float64}}}; chunk_threshold::Int64)
   @ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\lib\forward.jl:44
 [4] forward_jacobian(f::Function, x::NamedTuple{(:bias,), Tuple{Vector{Float64}}})
   @ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\lib\forward.jl:43
 [5] top-level scope
   @ REPL[124]:1

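It would be great to support the second mode (forward on forward on reverse). It looks like it would not take much to get there: forward_jacobian only fails because it calls size on the NamedTuple, and if I pass a plain vector instead of ps it works: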

julia> function f2(x, bias)
       hess = Zygote.hessian(x->sum(x.^3), x)
       return hess * x .+ bias
       end
f2 (generic function with 1 method)

julia> Zygote.forward_jacobian(p -> sum(f2(x,p)), rand(3))
(13.320223875130782, [1.0; 1.0; 1.0;;])
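
Until NamedTuple inputs are handled, one possible stopgap is to differentiate with respect to the array field and rebuild the NamedTuple inside the closure. This is only a sketch built on the vector case above: Zygote.forward_jacobian is the same internal function that appears in the stack traces, and the helper name grad_wrt_bias is hypothetical, not part of Zygote.

```julia
using Zygote

# Same f1 as in the report: Hessian of sum(x.^3) applied to x, plus a bias
# taken from a NamedTuple of parameters.
function f1(x, ps)
    hess = Zygote.hessian(x -> sum(x .^ 3), x)
    return hess * x .+ ps.bias
end

# Hypothetical helper (not a Zygote API): seed only the bias vector, and
# wrap it back into a NamedTuple before calling f1.
function grad_wrt_bias(x, ps)
    val, jac = Zygote.forward_jacobian(b -> sum(f1(x, (; bias = b))), ps.bias)
    return val, vec(jac)  # flatten the n×1 Jacobian into a gradient vector
end

x = rand(3)
ps = (; bias = rand(3))
val, g = grad_wrt_bias(x, ps)
```

This only covers parameters with a single array field; a NamedTuple with several fields would need each field flattened into one vector first, which is presumably what native support in seed would have to do.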

Labels: second order (zygote over zygote, or otherwise)