DataLoader slowdown throughout iterations with parallel=true #148

@jeremiedb

DataLoader performance degrades through iterations when parallel=true.

The following MWE illustrates the issue:

using Images
using StatsBase: sample, shuffle
using DataAugmentation
using Flux
using TestImages
import Base: length, getindex
using BenchmarkTools

const im_size = (224, 224)
imgs = rand(["chelsea", "coffee"], 1000)

struct ImageContainer{T<:Vector}
    img::T
end

length(data::ImageContainer) = length(data.img)
tfm = DataAugmentation.compose(ScaleKeepAspect(im_size))

function getindex(data::ImageContainer, idx::Int)
    path = data.img[idx]
    _img = testimage(path)
    _img = apply(tfm, Image(_img))
    img = collect(channelview(float32.(itemdata(_img))))
    return img
end

data = ImageContainer(imgs)
deval =
    Flux.DataLoader(data, batchsize = 32, parallel = true, collate = true, partial = true)

function data_loop(data)
    count = 0
    for x in data
        count += last(size(x))
    end
    return nothing
end

for i = 1:20
    @time data_loop(deval)
end

If launched with julia --project=@. --threads=8 .\data-test.jl:

  9.125682 seconds (23.69 M allocations: 5.049 GiB, 7.29% gc time, 73.35% compilation time: 10% of which was recompilation)
  2.291844 seconds (1.80 M allocations: 3.907 GiB, 36.56% gc time)
  1.763683 seconds (1.80 M allocations: 3.907 GiB, 17.21% gc time)
  1.754396 seconds (1.80 M allocations: 3.907 GiB, 16.14% gc time)
  1.856838 seconds (1.80 M allocations: 3.907 GiB, 19.07% gc time)
  1.771068 seconds (1.80 M allocations: 3.907 GiB, 13.99% gc time)
  1.916222 seconds (1.80 M allocations: 3.907 GiB, 18.39% gc time)
  1.957732 seconds (1.80 M allocations: 3.907 GiB, 13.32% gc time)
  2.336840 seconds (1.80 M allocations: 3.907 GiB, 14.05% gc time)
  2.433143 seconds (1.80 M allocations: 3.907 GiB, 15.70% gc time)
  2.524151 seconds (1.80 M allocations: 3.907 GiB, 12.57% gc time)
  2.289293 seconds (1.80 M allocations: 3.907 GiB, 10.86% gc time)
  2.628618 seconds (1.80 M allocations: 3.907 GiB, 14.58% gc time)
  2.495031 seconds (1.80 M allocations: 3.907 GiB, 11.85% gc time)
  2.509507 seconds (1.80 M allocations: 3.907 GiB, 13.24% gc time)
  2.513830 seconds (1.80 M allocations: 3.907 GiB, 11.68% gc time)
  2.531557 seconds (1.80 M allocations: 3.907 GiB, 13.33% gc time)
  2.514478 seconds (1.80 M allocations: 3.907 GiB, 10.86% gc time)
  2.724194 seconds (1.80 M allocations: 3.907 GiB, 12.45% gc time)
  2.689008 seconds (1.80 M allocations: 3.907 GiB, 13.42% gc time)
  2.399 s (1801661 allocations: 3.91 GiB)
  2.383 s (1801654 allocations: 3.91 GiB)
  2.513 s (1801675 allocations: 3.91 GiB)

With parallel=false, each pass is slower (about 7.5 seconds) but performance remains stable throughout the iterations:

 13.533416 seconds (19.77 M allocations: 4.849 GiB, 4.09% gc time, 39.94% compilation time: 12% of which was recompilation)
  7.463501 seconds (1.80 M allocations: 3.910 GiB, 4.21% gc time)
  7.458656 seconds (1.80 M allocations: 3.910 GiB, 4.08% gc time)
  7.461299 seconds (1.80 M allocations: 3.910 GiB, 4.06% gc time)
  7.568510 seconds (1.80 M allocations: 3.910 GiB, 4.06% gc time)
  7.491400 seconds (1.80 M allocations: 3.910 GiB, 4.02% gc time)
  7.683299 seconds (1.80 M allocations: 3.910 GiB, 3.97% gc time)
  7.523333 seconds (1.80 M allocations: 3.910 GiB, 4.21% gc time)
  7.472339 seconds (1.80 M allocations: 3.910 GiB, 4.00% gc time)
  7.545891 seconds (1.80 M allocations: 3.910 GiB, 4.04% gc time)
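
To put a number on the drift instead of reading it off the @time lines, one option is to record per-pass wall-clock times and compare the late passes against the early ones. The following is only a minimal sketch: `epoch_times` and `drift_ratio` are hypothetical helper names introduced here for illustration (they are not part of Flux or MLUtils), and the `reported` vector simply copies the twenty @time figures from the parallel=true run above.

```julia
using Statistics: mean

# Hypothetical helper: run `f()` `n` times and return the seconds taken by each pass.
function epoch_times(f, n::Integer)
    times = Vector{Float64}(undef, n)
    for i in 1:n
        times[i] = @elapsed f()
    end
    return times
end

# Hypothetical helper: mean of the last `k` timings divided by the mean of the
# first `k` timings, after dropping `warmup` initial passes (compilation).
function drift_ratio(times::AbstractVector{<:Real}; warmup::Integer = 1, k::Integer = 5)
    steady = times[warmup+1:end]
    length(steady) >= 2k || throw(ArgumentError("need at least $(2k) timings after warm-up"))
    return mean(steady[end-k+1:end]) / mean(steady[1:k])
end

# Seconds per pass, copied from the parallel=true @time output above.
reported = [9.125682, 2.291844, 1.763683, 1.754396, 1.856838,
            1.771068, 1.916222, 1.957732, 2.336840, 2.433143,
            2.524151, 2.289293, 2.628618, 2.495031, 2.509507,
            2.513830, 2.531557, 2.514478, 2.724194, 2.689008]

ratio = drift_ratio(reported)
println("late/early time ratio: ", round(ratio, digits = 2))

# With the MWE loaded, the same measurement could be taken live, e.g.:
#   drift_ratio(epoch_times(() -> data_loop(deval), 20))
```

On the figures reported above, the ratio comes out to roughly 1.37, i.e. the last five passes take about a third longer than the first five post-compilation passes. A ratio near 1 would indicate stable performance.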
