s3_list_keys() returns empty results despite objects existing in S3 #318

@hashemamireh

Description

s3_list_keys() returns an empty array when listing objects under certain prefixes, even though the objects exist and can be retrieved with s3_get(). The underlying S3 API (S3.list_objects_v2) correctly returns the objects with KeyCount=1, but s3_list_keys() fails to extract them.

Environment

  • Julia version: 1.11.7
  • AWSS3.jl version: v0.11.4
  • AWS.jl version: v1.96.0
  • OS: macOS

Minimal Reproducible Example

using AWSS3
using AWS
using AWS: @service
@service S3

aws = global_aws_config(; region="us-east-1")
bucket = "my-bucket"
prefix = "path/to/partition/marketplace_id=1/order_day=2025-09-01"

# This file exists and can be retrieved
key = "path/to/partition/marketplace_id=1/order_day=2025-09-01/part-00000.snappy.parquet"
parquet_object = s3_get(aws, bucket, key)  # ✅ Works fine

# But listing returns empty
keys = collect(s3_list_keys(aws, bucket, prefix))
println("Found $(length(keys)) keys")  # ❌ Returns: Found 0 keys

# However, the raw S3 API shows the object exists
response = S3.list_objects_v2(bucket, Dict("prefix" => prefix, "max-keys" => "10"))
println(response)
# ✅ Returns: KeyCount => "1", Contents => {...}

Expected Behavior

s3_list_keys() should return the keys that exist under the specified prefix, matching what the underlying S3 API returns.

Actual Behavior

s3_list_keys() returns an empty array, even though:

  1. The S3 API response shows KeyCount=1 with valid Contents
  2. s3_get() successfully retrieves the object using the full key
  3. s3_list_objects() returns a Channel that contains the object metadata when iterated

Diagnostic Output

julia> keys = collect(s3_list_keys(aws, bucket, prefix))
Any[]

julia> response = S3.list_objects_v2(bucket, Dict("prefix" => prefix, "max-keys" => "10"))
OrderedCollections.LittleDict{Union{String, Symbol}, Any, Vector{Union{String, Symbol}}, Vector{Any}} with 6 entries:
  "Name"        => "my-bucket"
  "Prefix"      => "path/to/partition/marketplace_id=1/order_day=2025-09-01/"
  "KeyCount"    => "1"
  "MaxKeys"     => "10"
  "IsTruncated" => "false"
  "Contents"    => LittleDict{Union{String, Symbol}, Any, Vector{Union{String, Symbol}}, Vector{Any}}("Key"=>"path/to/partition/marketplace_id=1/order_day=2025-09-01/part-00000.snappy.parquet", ...)

Additional Context

Both s3_list_keys() and s3_list_objects() return empty results, even though the underlying S3 API confirms the objects exist:

julia> objects_channel = s3_list_objects(aws, bucket, prefix)
Channel{OrderedCollections.LittleDict}(128) (empty)

julia> keys = collect(s3_list_keys(aws, bucket, prefix))
Any[]

Workaround

Use the raw S3 API directly:

using AWS: @service
@service S3

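response = S3.list_objects_v2(bucket, Dict("prefix" => prefix))
# "Contents" is missing when nothing matches, a single dict for one object
# (see the output above), and otherwise a vector of dicts.
# Note: this reads only the first page of results.
contents = get(response, "Contents", [])
keys = contents isa AbstractVector ? [c["Key"] for c in contents] : [contents["Key"]]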
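
The workaround above issues a single request, and ListObjectsV2 returns at most 1000 keys per response, so larger prefixes need pagination. The following is a minimal sketch of that loop, not AWSS3.jl's implementation. `list_all_keys` and `fetch_page` are hypothetical names introduced here, and the wiring to `S3.list_objects_v2` in the trailing comment assumes the `"continuation-token"` parameter and the `"NextContinuationToken"` response field of the ListObjectsV2 API.

```julia
# Hypothetical helper: `fetch_page(token)` must return one parsed ListObjectsV2
# response; `token` is `nothing` for the first request.
function list_all_keys(fetch_page)
    found = String[]
    token = nothing
    while true
        response = fetch_page(token)
        contents = get(response, "Contents", nothing)
        if contents isa AbstractVector
            # Several objects: a vector of dicts
            append!(found, [c["Key"] for c in contents])
        elseif contents !== nothing
            # Exactly one object: a single dict, as in the output above
            push!(found, contents["Key"])
        end
        # "IsTruncated" is a string in the output above; string() also covers a Bool
        string(get(response, "IsTruncated", "false")) == "true" || break
        token = response["NextContinuationToken"]
    end
    return found
end

# Assumed wiring to AWS.jl (needs credentials, so it is left commented out):
# fetch_page = token -> S3.list_objects_v2(bucket, token === nothing ?
#     Dict("prefix" => prefix) :
#     Dict("prefix" => prefix, "continuation-token" => token))
# all_keys = list_all_keys(fetch_page)
```

Passing the request as a function keeps the loop independent of AWS.jl, so it can be exercised with plain dictionaries before pointing it at a real bucket.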

Possible Cause

The issue may lie in how s3_list_keys() and s3_list_objects() parse the S3 API response, either for specific prefix patterns or when Contents holds a single object rather than an array of objects. The Channel is created but never populated with data from the API response.

I will look into this further and try to push a fix when I have some downtime, but I wanted to document and flag the issue in the meantime.
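
If the single-object hypothesis holds, one way to guard against it is to normalize Contents before iterating. The sketch below illustrates the idea on plain dictionaries. `contents_as_vector` and `response_keys` are hypothetical helpers introduced for illustration, not functions from AWSS3.jl, and the sample responses are made up.

```julia
# Hypothetical helper: coerce the "Contents" field of a parsed ListObjectsV2
# response into a vector, however many objects it describes.
function contents_as_vector(response::AbstractDict)
    contents = get(response, "Contents", nothing)
    contents === nothing && return Any[]            # no matching objects
    contents isa AbstractVector && return contents  # several objects
    return Any[contents]                            # one object, parsed as a single dict
end

# Extract only the object keys from a response.
response_keys(response::AbstractDict) =
    String[c["Key"] for c in contents_as_vector(response)]

# Plain dictionaries standing in for parsed responses (illustrative data)
resp_none = Dict("KeyCount" => "0")
resp_one  = Dict("KeyCount" => "1", "Contents" => Dict("Key" => "a/part-00000.parquet"))
resp_many = Dict("KeyCount" => "2", "Contents" => [Dict("Key" => "a/1"), Dict("Key" => "a/2")])

println(response_keys(resp_none))
println(response_keys(resp_one))
println(response_keys(resp_many))
```

With this normalization, a response holding one object and a response holding several go through the same code path.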
