Huge allocations

I'm currently figuring out which library to use for getting metadata from media files, and this library is definitely one of the fastest around. The only problem I have is the huge allocations it makes:

``` ini

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19042
Intel Core i7-7700K CPU 4.20GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=5.0.201
  [Host]     : .NET Core 5.0.4 (CoreCLR 5.0.421.11614, CoreFX 5.0.421.11614), X64 RyuJIT
  DefaultJob : .NET Core 5.0.4 (CoreCLR 5.0.421.11614, CoreFX 5.0.421.11614), X64 RyuJIT


```
|                     Method |               Folder |        Mean |     Error |    StdDev |         Min |         Max |      Gen 0 |      Gen 1 |      Gen 2 |    Allocated |
|--------------------------- |--------------------- |------------:|----------:|----------:|------------:|------------:|-----------:|-----------:|-----------:|-------------:|
| **ParseWithMetadataExtractor** | REDACTED | **563.2 ms** | **23.79 ms** | **15.74 ms** | **536.6 ms** | **580.6 ms** | **44000.0000** | **22000.0000** | **10000.0000** | **230725.6 KB** |
|         ParseWithMediaInfo | REDACTED | 771.5 ms | 29.93 ms | 19.80 ms | 745.8 ms | 801.8 ms |           - |           - |           - |     13.7 KB |

The benchmark above was run on a folder containing 144 files (136 videos and 8 images, 1 GB in total), and it allocated around 225 MB (230,725.6 KB).

|                     Method |               Folder |        Mean |     Error |    StdDev |         Min |         Max |      Gen 0 |      Gen 1 |      Gen 2 |    Allocated |
|--------------------------- |--------------------- |------------:|----------:|----------:|------------:|------------:|-----------:|-----------:|-----------:|-------------:|
| **ParseWithMetadataExtractor** | REDACTED | **140.9 ms** |  **9.84 ms** |  **6.51 ms** | **128.5 ms** | **147.1 ms** | **18000.0000** | **14000.0000** | **8000.0000** | **65805.38 KB** |
|         ParseWithMediaInfo | REDACTED | 217.7 ms | 15.54 ms | 10.28 ms | 203.3 ms | 231.9 ms |           - |           - |           - |    26.01 KB |


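The second benchmark (the table directly above) was run on a folder containing 276 files (images only), and again the allocations are far higher than seems reasonable: about 64 MB for MetadataExtractor versus 26 KB for MediaInfo.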

Using the [Dynamic Program Analysis](https://www.jetbrains.com/help/rider/Dynamic_Program_Analysis.html) built into Rider, I found that most allocations happen because the library reads the entire contents of a section into a byte array and often processes those bytes later on, for example for PNG and JPEG.

A possible improvement would be to use `Span<T>` or `ReadOnlySpan<T>` and `.Slice` for the chunks. Slicing a span returns another span of the same kind over the same memory, with no extra allocations.
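To make the slicing idea concrete, here is a minimal sketch that walks the chunks of a PNG buffer using only `ReadOnlySpan<byte>` slices. It is not how the library is implemented; `PngChunkWalker`, `ChunkVisitor` and `VisitChunks` are hypothetical names, and the sketch assumes the file bytes are already available as one span (for example from a pooled buffer or a memory-mapped view). It relies only on the standard PNG layout: an 8-byte signature followed by chunks of 4-byte big-endian length, 4-byte type, data and 4-byte CRC.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper for illustration; not part of the library's API.
public static class PngChunkWalker
{
    private const int SignatureLength = 8; // PNG files start with an 8-byte signature
    private const int ChunkOverhead = 12;  // length (4) + type (4) + CRC (4)

    // A custom delegate is needed because spans cannot be used as
    // generic type arguments (so Action<ReadOnlySpan<byte>, ...> is not allowed).
    public delegate void ChunkVisitor(ReadOnlySpan<byte> type, ReadOnlySpan<byte> data);

    // Calls 'visit' once per chunk and returns the number of chunks seen.
    // 'type' and 'data' are views into 'file'; no per-chunk array is allocated.
    public static int VisitChunks(ReadOnlySpan<byte> file, ChunkVisitor visit)
    {
        if (file.Length < SignatureLength) return 0;

        int count = 0;
        ReadOnlySpan<byte> rest = file.Slice(SignatureLength);
        while (rest.Length >= ChunkOverhead)
        {
            uint length = BinaryPrimitives.ReadUInt32BigEndian(rest);
            if (length > (uint)(rest.Length - ChunkOverhead)) break; // truncated chunk

            visit(rest.Slice(4, 4), rest.Slice(8, (int)length));
            rest = rest.Slice(ChunkOverhead + (int)length);
            count++;
        }
        return count;
    }
}
```

One design consequence worth noting: a span cannot be stored in a field of a class or kept across an `await`, so any API that needs to hold on to a chunk for later would have to use `ReadOnlyMemory<byte>` or copy the bytes at that point.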

Then there is also the concept of binary overlays. They differ from typical binary importing in that you do not read everything from the file into memory up front and then parse it. Instead, you keep an open stream and parse only the bare minimum needed to know the file layout. With the layout known, you can expose getters that call `Lazy` functions or similar, which jump to the specific position in the file stream and parse the section on demand instead of up front. This approach is extremely useful because the program only reads and parses what is actually needed, so allocations are kept to a minimum. The biggest problem is that it requires a fair amount of refactoring and API changes.
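The overlay approach described above could look roughly like the following sketch, again using PNG as the example format. `LazyChunkIndex`, `Entry` and `ReadData` are hypothetical names invented for illustration, not existing library types. The constructor reads only the 8-byte chunk headers and seeks past everything else, and a chunk's data is read only when a caller asks for it. The sketch assumes a seekable stream that stays open for the lifetime of the index.

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical overlay type for illustration; not part of the library's API.
public sealed class LazyChunkIndex
{
    public readonly struct Entry
    {
        public Entry(string type, long dataOffset, int length)
        {
            Type = type; DataOffset = dataOffset; Length = length;
        }
        public string Type { get; }
        public long DataOffset { get; } // position of the chunk data in the stream
        public int Length { get; }      // length of the chunk data in bytes
    }

    private readonly Stream _stream;
    public IReadOnlyList<Entry> Entries { get; }

    public LazyChunkIndex(Stream seekableStream)
    {
        if (!seekableStream.CanSeek)
            throw new ArgumentException("A seekable stream is required.", nameof(seekableStream));
        _stream = seekableStream;
        Entries = ScanLayout(seekableStream);
    }

    // Reads chunk headers only; chunk data and CRCs are skipped with Seek.
    private static List<Entry> ScanLayout(Stream s)
    {
        var entries = new List<Entry>();
        Span<byte> header = stackalloc byte[8]; // length (4) + type (4)
        s.Position = 8;                         // skip the PNG signature
        while (ReadFully(s, header))
        {
            int length = checked((int)BinaryPrimitives.ReadUInt32BigEndian(header));
            string type = Encoding.ASCII.GetString(header.Slice(4, 4));
            entries.Add(new Entry(type, s.Position, length));
            s.Seek(length + 4L, SeekOrigin.Current); // skip data + CRC
        }
        return entries;
    }

    // Reads one chunk's data on demand; this is the only per-chunk allocation.
    public byte[] ReadData(Entry entry)
    {
        var buffer = new byte[entry.Length];
        _stream.Position = entry.DataOffset;
        if (!ReadFully(_stream, buffer)) throw new EndOfStreamException();
        return buffer;
    }

    // Fills 'buffer' completely; returns false if the stream ends first.
    private static bool ReadFully(Stream s, Span<byte> buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int read = s.Read(buffer.Slice(total));
            if (read == 0) return false;
            total += read;
        }
        return true;
    }
}
```

A caller interested only in, say, the first `tEXt` chunk would look it up in `Entries` and call `ReadData` for that one entry, so large chunks that are never requested are never read into memory. The trade-off is the one mentioned above: the stream has to stay open while the index is in use, which changes the shape of the public API.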

Huge allocations #283
