Description
Currently, the dashboard starts to slow down with datasets of about 1.6 MB, and the slowdown becomes more pronounced by 3.3 MB (a CSV of roughly 12,000 rows by 15 columns).
The slowdown is most pronounced in:
- Initial processing of the dataset (hence we have a loading indicator).
- Selecting different color-by options for the map distribution; there is a pronounced delay before the map updates.
The specific source of the slowdown (there may be more than one) still needs to be diagnosed.
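As a first step, one low-effort option is to time each callback body and log the result. The decorator below is only a sketch (the `timed` name and the logger setup are made up for illustration); it can be stacked directly under the existing `@app.callback` decorators.

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("dashboard.timing")


def timed(func):
    """Log how long the wrapped callback body takes to run."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            logger.info("%s took %.3f s", func.__name__, elapsed)
    return wrapper


# Usage: place @timed directly below @app.callback(...), e.g.
#
# @app.callback(Output("map", "figure"), Input("color-by", "value"))
# @timed
# def update_map(color_by): ...
```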
The Dash documentation provides various suggestions under its performance section, and the dash-extensions package also has some options:
- Flask-Caching. Since the dataset processing is only done once, this would only have the potential to speed up the graphing components, though initial testing didn't seem to show an improvement (see the memoization sketch after this list).
a. A similar solution is background caching (see the background callback sketch after this list).
b. dash-extensions provides a server-side caching transform: ServersideOutputTransform. This removes the need for JSON serialization between callbacks and should speed things up, though I'm not sure how effective it was (I tried an implementation on the server-store branch). It would also require regular clean-up of the cached files (as with the previous two). A sketch of the pattern is after this list.
- Dash Patch class for updating the map (and potentially other graphs as well). Plotly Express is fast but starts to slow down around 15K points (per the Dash performance docs). I attempted to implement this for the map, but my initial attempt was unsuccessful; a sketch of the approach is after this list.
- Using orjson. I'm unclear on precisely how this potential solution works; it seems that simply having the package installed lets Dash serialize JSON with orjson instead. Errors are thrown if you try to use it for the serialization directly (the speedup comes from the fact that orjson produces bytes).
- Clientside callbacks. To implement this, the graphing portion would have to be translated into JavaScript (a minimal sketch follows below).
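For reference, the Flask-Caching memoization pattern from the Dash performance docs looks roughly like the following; the filesystem backend, cache directory, and `load_dataset` helper are placeholders rather than the dashboard's actual code.

```python
import pandas as pd
from dash import Dash
from flask_caching import Cache

app = Dash(__name__)

# Filesystem cache: results are pickled to disk and keyed by the memoized
# function's arguments, so the cached files need periodic clean-up.
cache = Cache(app.server, config={
    "CACHE_TYPE": "FileSystemCache",
    "CACHE_DIR": "./cache-directory",
    "CACHE_DEFAULT_TIMEOUT": 600,
})


@cache.memoize()
def load_dataset(path):
    """Expensive parse/clean step; repeat calls with the same path hit the cache."""
    return pd.read_csv(path)
```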
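If "background caching" refers to Dash's background callbacks (available in Dash 2.6+), a minimal setup looks like the sketch below; the diskcache directory and component ids are illustrative only.

```python
import diskcache
from dash import Dash, DiskcacheManager, Input, Output, dcc, html

# Background callbacks need a manager; diskcache is the simplest backend.
background_manager = DiskcacheManager(diskcache.Cache("./bg-cache"))

app = Dash(__name__, background_callback_manager=background_manager)
app.layout = html.Div([dcc.Upload(id="upload"), html.Div(id="status")])


@app.callback(
    Output("status", "children"),
    Input("upload", "contents"),
    background=True,            # runs outside the request so the UI stays responsive
    prevent_initial_call=True,
)
def process_upload(contents):
    # ... expensive parsing/processing would go here ...
    return "done"
```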
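The server-store branch presumably used something like the pattern below; the names follow older dash-extensions releases (newer versions replace `ServersideOutput` with a `Serverside` return wrapper), and the component ids and processing step are placeholders. The idea is that the DataFrame stays on the server and only a reference key goes through the Store, so callbacks skip the JSON round-trip.

```python
import pandas as pd
import plotly.express as px
from dash import dcc, html
from dash_extensions.enrich import (
    DashProxy, Input, Output, ServersideOutput, ServersideOutputTransform,
)

app = DashProxy(__name__, transforms=[ServersideOutputTransform()])
app.layout = html.Div([
    dcc.Upload(id="upload"),
    dcc.Store(id="df-store"),
    dcc.Graph(id="map"),
])


@app.callback(ServersideOutput("df-store", "data"), Input("upload", "contents"))
def process(contents):
    # Placeholder for the real parsing/cleaning step.
    df = pd.DataFrame({"lat": [51.5], "lon": [-0.1]})
    return df  # cached server-side; the store itself only holds a key


@app.callback(Output("map", "figure"), Input("df-store", "data"))
def draw_map(df):
    # df arrives as an actual DataFrame, not a JSON string.
    return px.scatter_mapbox(df, lat="lat", lon="lon", mapbox_style="open-street-map")
```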
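For the Patch approach (Dash 2.9+), only the changed figure properties are sent to the browser instead of a whole new figure. The sketch below assumes a single-trace, continuous-colour px.scatter_mapbox figure and a hypothetical `df` and color-by dropdown; with a categorical colour-by the figure has one trace per category, so the patch would need a different structure (possibly why the first attempt was unsuccessful).

```python
import pandas as pd
import plotly.express as px
from dash import Dash, Input, Output, Patch, dcc, html

# Placeholder data; in the real app this is the processed upload.
df = pd.DataFrame({"lat": [51.5, 48.9], "lon": [-0.1, 2.4],
                   "value_a": [1.0, 2.0], "value_b": [5.0, 3.0]})

app = Dash(__name__)
app.layout = html.Div([
    dcc.Dropdown(id="color-by", options=["value_a", "value_b"], value="value_a"),
    dcc.Graph(id="map", figure=px.scatter_mapbox(
        df, lat="lat", lon="lon", color="value_a",
        mapbox_style="open-street-map")),
])


@app.callback(Output("map", "figure"), Input("color-by", "value"))
def recolor(column):
    # Only the marker colours cross the wire, not the whole figure.
    patched = Patch()
    patched["data"][0]["marker"]["color"] = df[column].tolist()
    return patched
```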
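For the clientside-callback option, the colour-by update could run entirely in the browser. The sketch below assumes the candidate colour columns are stored per point in the trace's customdata and that the dropdown supplies a column index; the ids, the customdata layout, and the minimal `app` setup are assumptions.

```python
from dash import Dash, Input, Output, State

app = Dash(__name__)  # layout with the "map" graph and "color-by-index" dropdown omitted

app.clientside_callback(
    """
    function(colIndex, figure) {
        if (!figure || colIndex == null) {
            return window.dash_clientside.no_update;
        }
        // Copy the figure so Dash/React sees a new object, then recolour it.
        const updated = JSON.parse(JSON.stringify(figure));
        const trace = updated.data[0];
        trace.marker = Object.assign({}, trace.marker, {
            color: trace.customdata.map(row => row[colIndex])
        });
        return updated;
    }
    """,
    Output("map", "figure"),
    Input("color-by-index", "value"),
    State("map", "figure"),
)
```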