@ronsaldo
Contributor

New file API initial version for review and integration. This is the same code in https://github.com/ronsaldo/pharo-newfile

@Ducasse
Member

Ducasse commented Dec 10, 2025

Thanks Ronie!
Do you have some benchmarks to compare the current implementation with the new one?

@ronsaldo
Contributor Author

ronsaldo commented Dec 10, 2025

@Ducasse I do not have the benchmarks in Pharo yet. However, I do have the benchmarks in C at https://github.com/ronsaldo/fs-benchmarks

I have benchmark data in C for Windows, Linux and OS X at https://github.com/ronsaldo/fs-benchmarks/tree/main/benchs .

@jecisc
Member

jecisc commented Dec 11, 2025

Is this new implementation supposed to replace the File class?

@ronsaldo
Contributor Author

Hello,

@jecisc This implementation adds a NewFilePlugin without touching the existing FilePlugin. In fact, to provide backward/forward compatibility, I am calling the primitives in the File class from a new OldFile class. To use the NewFile class, the VM has to be compiled with the NewFilePlugin. To choose between them (OldFile and NewFile), I added FileAPI and DirectoryAPI, which select the new file class if NewFilePlugin primitiveIsAvailable returns true. If the primitive does not return true, then OldFile is selected.
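
The selection rule described above can be sketched as follows. This is only an illustrative Python sketch, not the Pharo code in this PR: `select_file_api`, the two stub classes, and the choice to treat a failing probe as "plugin missing" are all assumptions made for the example.

```python
class OldFile:
    """Stand-in for the class that wraps the existing FilePlugin primitives."""
    name = "OldFile"


class NewFile:
    """Stand-in for the class backed by NewFilePlugin."""
    name = "NewFile"


def select_file_api(new_plugin_available):
    """Return NewFile only when the availability probe returns exactly True.

    `new_plugin_available` is a hypothetical zero-argument callable that
    plays the role of `NewFilePlugin primitiveIsAvailable`.
    """
    try:
        available = new_plugin_available() is True
    except Exception:
        # Assumption: a probe that fails is treated like a missing plugin.
        available = False
    return NewFile if available else OldFile


print(select_file_api(lambda: True).name)
print(select_file_api(lambda: False).name)
```

Anything other than an explicit true result falls back to the old implementation, which matches the "if the primitive does not return true" wording.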

@Ducasse Benchmarks can be run by using the following script:

FileAPIBenchmarks runWithFileAPI: OldFile.
FileAPIBenchmarks runWithFileAPI: NewFile.

Here are the results for Windows:

OldFile Benchmarks
RandomDatasetGenerationTime 1899.0 ms
writeDataAtRandomOffsets speed (MB/s) 602.3529411764706
writeDataAtRandomOffsets speed (MB/s) 585.1428571428572
writeSmallData speed (MB/s) 305.6716417910448
writeSmallData speed (MB/s) 525.1282051282051
writeData speed (MB/s) 3508.7719298245615
writeData speed (MB/s) 3846.153846153846
readData speed (MB/s) 4545.454545454546
readData speed (MB/s) 4761.904761904762

NewFile Benchmarks
RandomDatasetGenerationTime 1857.0 ms
writeDataAtRandomOffsets speed (MB/s) 585.1428571428572
writeDataAtRandomOffsets speed (MB/s) 585.1428571428572
writeSmallData speed (MB/s) 435.74468085106383
writeSmallData speed (MB/s) 640.0
writeData speed (MB/s) 3703.703703703704
writeData speed (MB/s) 3921.5686274509803
readData speed (MB/s) 5000.0
readData speed (MB/s) 5128.205128205129

@Ducasse
Member

Ducasse commented Dec 16, 2025

Thanks Ronie. I imagine that Pablo is following this. @tesonep

@Ducasse
Member

Ducasse commented Dec 20, 2025

@ronsaldo do you have some benchmarks on Windows?
Could we know your benchmark protocol (and the standard deviation)?
I see a difference of around 10% in your benchmarks, so maybe this is because the platform already has good support.

@ronsaldo
Contributor Author

I refactored the benchmarks to compute multiple samples so that an average and a standard deviation can be obtained. For Linux, these are the benchmark results that I am obtaining:

OldFile Benchmarks
RandomDatasetGenerationTime 1841.0 ms
writeDataAtRandomOffsets 100000 samples speed (MB/s): 681.9290012713232 +- 220.02983954446916
writeSmallData 100000 samples speed (MB/s): 687.5684723322828 +- 219.2142073172236
writeData 1000 samples speed (MB/s): 3586.315491792918 +- 458.615293287945
readData 1000 samples speed (MB/s): 10884.724060438857 +- 1374.242198863336

NewFile Benchmarks
RandomDatasetGenerationTime 1843.0 ms
writeDataAtRandomOffsets 100000 samples speed (MB/s): 1429.597808948928 +- 517.6167476138207
writeSmallData 100000 samples speed (MB/s): 1411.8949362592703 +- 514.9630706745908
writeData 1000 samples speed (MB/s): 3613.4586448298123 +- 414.8267985361467
readData 1000 samples speed (MB/s): 11273.545366257833 +- 1474.0870670733989
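
For readers who want to reproduce the "mean +- std" style of figure shown above, here is a minimal Python sketch. It is not the benchmark code from this PR: the function name `throughput_stats` is hypothetical, 1 MB is assumed to be 10^6 bytes, and the sample (n - 1) standard deviation is assumed, since the thread does not say which variant is used.

```python
import statistics


def throughput_stats(bytes_per_sample, elapsed_seconds):
    """Return (mean, sample standard deviation) of per-sample speed in MB/s.

    Assumptions: every sample transfers `bytes_per_sample` bytes, every
    elapsed time is greater than zero, and 1 MB = 1e6 bytes.
    """
    speeds = [bytes_per_sample / t / 1e6 for t in elapsed_seconds]
    return statistics.mean(speeds), statistics.stdev(speeds)


# Three made-up timings for a 1 MB transfer (1 ms, 2 ms, 4 ms).
mean, std = throughput_stats(1_000_000, [0.001, 0.002, 0.004])
print(f"{mean:.1f} +- {std:.1f} MB/s")
```

Averaging per-sample speeds gives a different number than dividing total bytes by total time, so the two should not be compared directly.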

@ronsaldo
Contributor Author

Here are some results on Windows:

OldFile Benchmarks
RandomDatasetGenerationTime 1906.0 ms
writeDataAtRandomOffsets 100000 samples speed (MB/s): 2041.330196480002 +- 116.62653038585267
writeSmallData 100000 samples speed (MB/s): 2041.3506560000023 +- 116.44810209473884
writeData 1000 samples speed (MB/s): 762238.0 +- 425646.148813777
readData 1000 samples speed (MB/s): 827173.0 +- 378058.102677353

NewFile Benchmarks
RandomDatasetGenerationTime 1856.0 ms
writeDataAtRandomOffsets 100000 samples speed (MB/s): 2041.5552512000013 +- 114.64834648698691
writeSmallData 100000 samples speed (MB/s): 2041.575710720002 +- 114.46679446326793
writeData 1000 samples speed (MB/s): 767233.0 +- 422530.81426092464
readData 1000 samples speed (MB/s): 837163.0 +- 369180.9434410178

I am getting zero in some time measurements. These results seem to suffer from an inaccurate microseconds clock.
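
One common workaround for zero-length measurements, sketched here in Python and not something this PR is known to do, is to repeat the operation until a minimum interval has elapsed and then divide by the number of calls. The name `timed_batch` and the 1 ms threshold are assumptions chosen for illustration.

```python
import time


def timed_batch(operation, min_elapsed_ns=1_000_000):
    """Call `operation` repeatedly until at least `min_elapsed_ns` has passed.

    Returns (calls, elapsed_ns). With a positive threshold, elapsed_ns is
    never zero, so a throughput computed from it stays finite.
    """
    calls = 0
    start = time.perf_counter_ns()
    while True:
        operation()
        calls += 1
        elapsed = time.perf_counter_ns() - start
        if elapsed >= min_elapsed_ns:
            return calls, elapsed


# Hypothetical workload: allocate a 4 KiB buffer per call.
calls, elapsed_ns = timed_batch(lambda: bytes(4096))
print(calls, "calls in", elapsed_ns, "ns")
```

The threshold should be well above the resolution of the clock in use, otherwise rounding still dominates the result.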

@Ducasse
Member

Ducasse commented Dec 21, 2025

OK. I always thought that file handling on Windows had a problem, but if your implementation is only marginally better there, it looks like the Windows primitives that we have are already good.
