Load data from Parquet: poor performance for big datasets #3396

@aceforeverd

Description

The Parquet file is 3.5G on disk and possibly 35G in memory. When loading the data from Parquet into online storage, Spark bootstraps the job, which fails with an OOM (out of memory) error about 30 minutes after starting.
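
For a sense of scale, here is a rough back-of-the-envelope sketch of the sizes reported above. It only does arithmetic on the two numbers from the description; the 128 MiB per-partition target and both function names are assumptions made for illustration, not settings or APIs of Spark or of this project, and "G" is read as GiB.

```python
import math


def expansion_ratio(on_disk_gb: float, in_memory_gb: float) -> float:
    """How many times larger the data is in memory than on disk."""
    return in_memory_gb / on_disk_gb


def partitions_needed(in_memory_gb: float, target_partition_mb: int = 128) -> int:
    """Partitions required so each holds at most target_partition_mb in memory.

    target_partition_mb is a hypothetical target chosen for this sketch.
    """
    return math.ceil(in_memory_gb * 1024 / target_partition_mb)


# Sizes taken from the report: 3.5G Parquet file, possibly 35G in memory.
ratio = expansion_ratio(3.5, 35.0)
parts = partitions_needed(35.0)
print(f"in-memory / on-disk ratio: {ratio:.1f}x")
print(f"partitions at 128 MiB each: {parts}")
```

If the 35G estimate holds, any stage that materializes the whole dataset in a single executor would need memory well beyond the on-disk size, which is consistent with the OOM described above.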
