Home

Welcome to the DataFrame wiki (2018)

At this point I am more concerned with making the interface as good as I can, and only then with implementation efficiency. But I still want to establish some implementation guidelines:

All data should be stored in contiguous memory rather than in containers of pointers.
We should be able to store any data type and any number of columns.
We should avoid class derivation and object-oriented polymorphism as much as possible, and rely on template polymorphism instead.
We should use move instead of copy as much as possible.
Integrate the use of async() and futures.
Use internal multi-threading.
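
To make the guidelines above concrete, here is a minimal sketch of how several of them could look together in C++17. It is not the library's actual code: `Column`, `mean`, and `mean_pair` are hypothetical names invented for this illustration. It shows contiguous storage in a `std::vector`, a function template instead of a virtual interface, moving a buffer into the container, and `std::async` with a future.

```cpp
#include <future>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy column (not the library's type): values live in one
// contiguous std::vector<T>, not in a container of pointers.
template <typename T>
struct Column {
    std::string name;
    std::vector<T> data;

    // Take the vector by value and move it in, so a caller can hand over
    // a buffer without copying its elements.
    Column(std::string n, std::vector<T> d)
        : name(std::move(n)), data(std::move(d)) {}
};

// Template polymorphism: one function serves every arithmetic column type
// and is resolved at compile time, with no base class or virtual call.
template <typename T>
double mean(const Column<T>& col) {
    if (col.data.empty()) return 0.0;
    const double sum = std::accumulate(col.data.begin(), col.data.end(), 0.0);
    return sum / static_cast<double>(col.data.size());
}

// Compute the means of two columns concurrently: the first runs on another
// thread through std::async, the second on the calling thread.
template <typename A, typename B>
std::pair<double, double> mean_pair(const Column<A>& a, const Column<B>& b) {
    auto fa = std::async(std::launch::async, [&a] { return mean(a); });
    const double mb = mean(b);
    return {fa.get(), mb};  // get() blocks until the async task has finished
}
```

A caller would build columns such as `Column<int> prices("price", {1, 2, 3, 4});` and pass two of them to `mean_pair`. On some toolchains, code that uses `std::async` must be linked with thread support (for example `-pthread`).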

To achieve the above principles, and considering that C++ is a statically typed language, the interface will be somewhat more bloated than Pandas. It will be a bit cluttered with type specifications.
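
As an illustration of what "cluttered with type specifications" means, here is a small sketch of a toy heterogeneous container in C++17. Everything in it is an assumption made for the example: `ToyFrame`, `add`, and `column` are hypothetical names, and the `std::any`-based storage is only a convenient stand-in, not how the library stores its columns. The point is the call site: where Pandas lets you write `df["price"]`, a statically typed interface needs the element type spelled out.

```cpp
#include <any>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical toy frame, for illustration only.
class ToyFrame {
public:
    // Store a column of any element type under a name.
    template <typename T>
    void add(const std::string& name, std::vector<T> values) {
        columns_[name] = std::move(values);
    }

    // The caller must name the element type T explicitly.
    // Throws std::out_of_range if the column does not exist, and
    // std::bad_any_cast if T does not match the stored type.
    template <typename T>
    const std::vector<T>& column(const std::string& name) const {
        return std::any_cast<const std::vector<T>&>(columns_.at(name));
    }

private:
    // Each entry holds a std::vector<T> for some T chosen at add() time.
    std::unordered_map<std::string, std::any> columns_;
};
```

With this sketch, reading a column looks like `df.column<double>("price")`, and asking for the wrong type, such as `df.column<int>("price")`, fails at run time with `std::bad_any_cast`.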

As to why I am doing this and where the need is, I see a few good reasons:

Python cannot handle large amounts of data, especially when it comes to financial intraday (i.e. tick-by-tick) data. So if you want a package to do statistical analysis on intraday data, you need an efficient environment, such as a compiled program.
In medium- to high-frequency trading these days, the research is usually done in Python (sometimes using Pandas), but the execution is usually done in C++. So there is a translation phase in between that is often a source of trouble.
It is fun to do.

Update (2025)

Looking back at the guidelines I set at the beginning of this endeavor, I see that I am not far off the mark after 7 years. The library is almost unrecognizable compared to what it was 7 years ago, but I am glad to say that I stuck with the original guidelines throughout, and I still believe they were the right ones.
