
Dataframe input for functions #39

Closed
@robwandrews

Description


From a conversation started in #37 by @bmu:

As an idea, they could possibly also be used by users in the future, if we switched to DataFrame inputs for some functions or to a different kind of API. This would enable us to include some "magic" functions, e.g. the signature of the ominous globalinplane function could look like this:

def globalinplane(df, surface_tilt, surface_azimuth, diffuse_model='perez',
                  decomposition_model=None):
    """Determine GPOA from either GHI and DHI or from GHI only.

    Parameters
    ----------
    df : pandas.DataFrame
        A DataFrame containing all necessary columns according to naming
        conventions (maybe surface_tilt and surface_azimuth could also be
        contained in the df, e.g. for tracking systems).
    decomposition_model : None or str
        The model to use if only GHI is given in the DataFrame.
    ....

    Returns
    -------
    The input DataFrame plus columns `direct tilted`, `diffuse tilted`, `GPOA` ...
    """

This function could check which columns are in the DataFrame and compute all necessary columns (e.g. if time is given as local time, compute UTC or true solar time).
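
To make the column-detection idea concrete, here is a minimal sketch. The column names `ghi` and `dhi`, the step names, and the helper `plan_steps` are all assumptions for illustration; none of them is an existing pvlib convention or function.

```python
import pandas as pd


def plan_steps(df):
    """Decide which simulation steps are needed from the columns present.

    Hypothetical helper: column and step names are illustrative only.
    """
    cols = set(df.columns)
    if "ghi" not in cols:
        raise ValueError("expected a 'ghi' column")
    steps = []
    if "dhi" not in cols:
        # Only GHI is available, so it must first be split into
        # direct and diffuse components.
        steps.append("decomposition")
    # Transposition to the plane of array is always required.
    steps.append("transposition")
    return steps


ghi_only = pd.DataFrame({"ghi": [0.0, 250.0, 600.0]})
ghi_dhi = pd.DataFrame({"ghi": [0.0, 250.0, 600.0], "dhi": [0.0, 100.0, 150.0]})

print(plan_steps(ghi_only))  # ['decomposition', 'transposition']
print(plan_steps(ghi_dhi))   # ['transposition']
```

A real implementation would also need to agree on the naming conventions and on how to handle time columns, which this sketch leaves out.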

In my experience this is useful for beginners, because quite a lot of simulation steps are required to calculate GPOA from GHI (maybe convert times, decomposition into direct and diffuse, diffuse model, ground reflection, direct tilted, ...?).
There are other options too, e.g. calculate the expected energy yield from only a system description, a Location, and a DataFrame containing GHI and ambient temperature.

This is just an idea, and I am not sure about the difficulties. There may be some complexity in implementing something like this (more in the definitions than from a programmer's point of view), and it may be difficult to explain to users.

My view on this would be to stay away from passing DataFrames as inputs, especially as the only form of input. One of the advantages of pvlib is the ability to use different inputs (irradiance sources, plane transposition models, etc.) and compare the outputs of the functions. This means that a user might have multiple versions of dni, ghi, pmp, etc. that they want to use in the functions. Though it is possible to repackage a new DataFrame each time a variable is swapped out, this adds unnecessary steps on the user side and makes it harder to track explicitly what is being passed through a function. When I was originally making tools for myself, I did make them with DataFrame inputs, found that this led to too many hard-to-trace errors, and ended up switching to explicit inputs.
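
As a rough illustration of the difference, the sketch below compares two DNI sources with explicit inputs and with a DataFrame input. Both functions are hypothetical stand-ins (beam irradiance on the plane as DNI times the cosine of the angle of incidence), not pvlib functions, and the column names are assumed.

```python
import pandas as pd


def beam_on_plane(dni, cos_aoi):
    """Hypothetical explicit-input helper: beam irradiance on the plane."""
    return dni * cos_aoi.clip(lower=0)


def beam_on_plane_df(df):
    """Same calculation, but it relies on the column names 'dni' and 'cos_aoi'."""
    return df["dni"] * df["cos_aoi"].clip(lower=0)


cos_aoi = pd.Series([0.25, 0.5, 0.75])
dni_measured = pd.Series([100.0, 700.0, 400.0])
dni_modeled = pd.Series([120.0, 650.0, 420.0])

# Explicit inputs: swapping the irradiance source is just a different argument.
beam_measured = beam_on_plane(dni_measured, cos_aoi)
beam_modeled = beam_on_plane(dni_modeled, cos_aoi)

# DataFrame input: each variant has to be repackaged under the expected name.
base = pd.DataFrame({"cos_aoi": cos_aoi, "dni": dni_measured})
beam_modeled_df = beam_on_plane_df(base.assign(dni=dni_modeled))
```

Both routes give the same numbers; the difference is that in the DataFrame version the source of `dni` is no longer visible at the call site.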

It might be interesting to have df input as an optional input along with the explicit variables (which might have been what you were suggesting), but I wouldn't want to move completely to df input.
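
One possible shape for such a mixed interface is sketched here, assuming that explicit arguments take precedence over DataFrame columns of the same name. The helper `resolve_inputs` and the example function `diffuse_fraction` are hypothetical names used only for illustration.

```python
import pandas as pd


def resolve_inputs(df=None, **explicit):
    """Take each input from its explicit keyword if given, else from df."""
    resolved = {}
    for name, value in explicit.items():
        if value is not None:
            resolved[name] = value          # explicit argument wins
        elif df is not None and name in df.columns:
            resolved[name] = df[name]       # fall back to the DataFrame column
        else:
            raise ValueError(f"missing input: {name}")
    return resolved


def diffuse_fraction(df=None, ghi=None, dhi=None):
    """Hypothetical example function: ratio of diffuse to global horizontal."""
    inputs = resolve_inputs(df, ghi=ghi, dhi=dhi)
    return inputs["dhi"] / inputs["ghi"]


df = pd.DataFrame({"ghi": [200.0, 400.0], "dhi": [100.0, 100.0]})

from_df = diffuse_fraction(df)                                  # all inputs from df
overridden = diffuse_fraction(df, dhi=pd.Series([50.0, 200.0]))  # swap one input
explicit_only = diffuse_fraction(ghi=df["ghi"], dhi=df["dhi"])   # no df at all
```

With this layout, swapping a single variable does not require building a new DataFrame, and a call with only explicit inputs behaves exactly as it does today.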
