-
-
Notifications
You must be signed in to change notification settings - Fork 19.4k
Description
Code Sample, a copy-pastable example if possible
In [62]: df = pd.DataFrame([(float(x) for x in range(0, 10)), (float(x) for x in range(10,20))])
In [63]: df
Out[63]:
0 1 2 3 4 5 6 7 8 9
0 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0
1 10.0 11.0 12.0 13.0 14.0 15.0 16.0 17.0 18.0 19.0
In [64]: df[0]
Out[64]:
0 0.0
1 10.0
Name: 0, dtype: float64
In [65]: df[0].astype(int)
Out[65]:
0 0
1 10
Name: 0, dtype: int64
In [66]: df[0] = df[0].astype(int)
In [67]: df
Out[67]:
0 1 2 3 4 5 6 7 8 9
0 0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0
1 10 11.0 12.0 13.0 14.0 15.0 16.0 17.0 18.0 19.0
In [68]: df.iloc[0]
Out[68]:
0 0.0
1 1.0
2 2.0
3 3.0
4 4.0
5 5.0
6 6.0
7 7.0
8 8.0
9 9.0
Name: 0, dtype: float64Problem description
After I reassign the 0th column as int, I expect it to be int, and it appears to be that way. But when I do a .iloc on the dataframe, it seems to be returning back into being a float somehow!
This is the narrowed-down version of a more insidious problem where instead of doing an iloc[] I was running a .apply(f) on a dataframe and the dtypes of the resulting dataframe were all messed up even when the function f wasn't doing anything discernible with the types, so I narrowed it down to this.
Current workaround is to re-cast all the types in f, but that can get frustrating real quick depending on the number of columns.
Expected Output
I expect the row to be a mixed dtype object, with the dtype of each cell matching that of the column:
In [68]: df.iloc[0]
Out[68]:
0 0
1 1.0
2 2.0
3 3.0
4 4.0
5 5.0
6 6.0
7 7.0
8 8.0
9 9.0
Name: 0, dtype: object
Output of pd.show_versions()
Details
INSTALLED VERSIONS ------------------ commit: None python: 3.6.0.final.0 python-bits: 64 OS: Darwin OS-release: 15.6.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8pandas: 0.20.1
pytest: 3.0.7
pip: 9.0.1
setuptools: 35.0.2
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: None
IPython: 6.0.0
sphinx: None
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: 1.5.1
bottleneck: 1.2.0
tables: 3.3.0
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: 0.0.9
pandas_gbq: None
pandas_datareader: None