Description
For openzim/mwoffliner#2180 I had to analyze the ZIM content.
I did it with the python-libzim binding because I'm far more comfortable with it.
The difficulty I ran into (luckily not a blocker) is that while it is possible to get an Item's size (uncompressed, AFAIK), I did not find any way to get its compressed size. It was hence hard to be 100% sure where the increase in ZIM size came from.
Is that expected, since there is no such thing as a per-item compressed size because we only compress clusters, not individual items? Or is it just something that is missing from the binding(s)? Should I have used another tool / zimtool to do this analysis?
Having at least a rough estimate of the compression factor for every item would help to dig deeper in such situations. Maybe simply exposing clusters, which cluster each item belongs to, and each cluster's compression factor (its compressed and uncompressed sizes, for instance) would be sufficient.
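
To illustrate the kind of estimate I have in mind, here is a rough sketch. It assumes the binding could report a cluster's compressed and uncompressed sizes; `ClusterInfo` and `estimate_compressed_size` are hypothetical names, not part of python-libzim today, and `zlib` only stands in for the real cluster compression. Each item is simply given a share of the cluster's compressed size proportional to its uncompressed size.

```python
import zlib
from dataclasses import dataclass


@dataclass
class ClusterInfo:
    """Hypothetical per-cluster sizes; NOT an existing python-libzim type."""
    uncompressed_size: int
    compressed_size: int

    @property
    def compression_factor(self) -> float:
        # Assumes compressed_size > 0; a factor above 1 means the cluster shrank.
        return self.uncompressed_size / self.compressed_size


def estimate_compressed_size(item_size: int, cluster: ClusterInfo) -> float:
    """Give an item its proportional share of the cluster's compressed size."""
    if cluster.uncompressed_size == 0:
        return 0.0
    return cluster.compressed_size * item_size / cluster.uncompressed_size


# Toy cluster built from made-up items; zlib is a stand-in compressor.
items = {
    "A/Article": b"<p>hello world</p>" * 200,
    "I/logo.png": bytes(range(256)) * 4,
}
blob = b"".join(items.values())
cluster = ClusterInfo(len(blob), len(zlib.compress(blob)))

estimates = {
    path: estimate_compressed_size(len(data), cluster)
    for path, data in items.items()
}
for path, est in estimates.items():
    print(f"{path}: {len(items[path])} bytes uncompressed, ~{est:.0f} bytes compressed")
```

This is deliberately crude: the estimates add up to the cluster's compressed size, but every item in a cluster gets the same factor, so a highly repetitive item and an incompressible one sharing a cluster look identical. It would still be enough to see which clusters, and roughly which items, account for a size increase.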