Hossein Moein edited this page Jul 13, 2025 · 4 revisions

Welcome to the DataFrame wiki (2018)

At this point, I am more concerned with getting the interface as good as I can; implementation efficiency comes second. Still, I want to establish some implementation guidelines:

  • All data should be stored in contiguous memory rather than in containers of pointers.
  • We should be able to store any data type and any number of columns.
  • We should avoid class derivation and object-oriented polymorphism as much as possible, and use template polymorphism instead.
  • We should use move instead of copy as much as possible.
  • Integrate the use of async() and futures.
  • Use internal multi-threading.
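As a rough illustration of how these guidelines can fit together, here is a minimal sketch. It is not the library's code: `Column`, `sum`, `mean`, and `sum_both` are hypothetical names made up for this example. It assumes one contiguous `std::vector<T>` per column, function templates instead of virtual functions, by-value-then-move construction, and `std::async` for running independent work concurrently.

```cpp
#include <cassert>
#include <future>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Hypothetical column type for illustration; not part of the DataFrame library.
// Values live in one contiguous std::vector<T>, not in a container of pointers.
template <typename T>
struct Column {
    std::string name;
    std::vector<T> values;

    // Arguments are taken by value and moved into place, so passing an
    // rvalue hands over the existing buffer instead of copying it.
    Column(std::string n, std::vector<T> v)
        : name(std::move(n)), values(std::move(v)) {}
};

// Template polymorphism: one definition serves every Column<T> whose T
// supports addition. No base class and no virtual dispatch are involved.
template <typename T>
T sum(const Column<T>& col) {
    return std::accumulate(col.values.begin(), col.values.end(), T{});
}

// Arithmetic mean as a double; the column must not be empty.
template <typename T>
double mean(const Column<T>& col) {
    assert(!col.values.empty());
    return static_cast<double>(sum(col)) /
           static_cast<double>(col.values.size());
}

// Sums two columns of possibly different types concurrently.
// Each std::async call returns a future; get() blocks until the result is ready.
template <typename A, typename B>
std::pair<A, B> sum_both(const Column<A>& a, const Column<B>& b) {
    auto fa = std::async(std::launch::async, [&a] { return sum(a); });
    auto fb = std::async(std::launch::async, [&b] { return sum(b); });
    return {fa.get(), fb.get()};
}
```

Calling `sum_both(prices, volumes)` on a `Column<double>` and a `Column<int>` returns both totals as a `std::pair<double, int>`. Constructing a column with `Column<int>("c", std::move(buffer))` reuses the vector's existing allocation rather than duplicating it.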

To achieve the above principles, and considering that C++ is a statically typed language, the interface will be somewhat more bloated than Pandas. It will be a bit cluttered with type specifications.
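To see where the type specifications come from, consider a minimal sketch of a container holding columns of different types. `TinyFrame`, `load`, and `get` are hypothetical names invented for this example, and `std::any` (C++17) is just one possible type-erasure technique, not necessarily what the library uses. The point is only that the compiler cannot know a column's type from its name, so the caller has to state it.

```cpp
#include <any>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical container for illustration only; not the DataFrame library's API.
class TinyFrame {
public:
    // When loading, the column type T is deduced from the argument.
    template <typename T>
    void load(const std::string& name, std::vector<T> values) {
        columns_[name] = std::move(values);
    }

    // When reading, the caller must spell out T explicitly.
    // Throws std::out_of_range for an unknown name and
    // std::bad_any_cast if T does not match the stored column type.
    template <typename T>
    const std::vector<T>& get(const std::string& name) const {
        return std::any_cast<const std::vector<T>&>(columns_.at(name));
    }

private:
    // Each entry holds a whole std::vector<T>, so every column is still contiguous.
    std::unordered_map<std::string, std::any> columns_;
};
```

In Pandas one would write something like `df["price"]`; in this sketch the equivalent read is `frame.get<double>("price")`, and asking for the wrong type is a runtime error rather than a silent conversion.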

As to why I am doing this and where the need is, I see a few good reasons:

  1. Python cannot handle large amounts of data, especially when it comes to financial intraday (i.e. tick-by-tick) data. So if you want a package to do statistical analysis on intraday data, you need an efficient environment, such as a compiled program.
  2. In medium- to high-frequency trading these days, the research is usually done in Python (sometimes using Pandas), but the execution is usually done in C++. So there is a translation phase in between that is often a source of trouble.
  3. It is fun to do.

Update (2025)

Looking back at the guidelines I set at the beginning of this endeavor, I see that I am not far off the mark after 7 years. The library is almost unrecognizable from what it was back then, but I am glad to say that I have stuck with the original guidelines throughout, and I still believe they were the right ones.
