Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
import pyarrow as pa
decimal_type = pd.ArrowDtype(pa.decimal128(3, scale=2))
series = pd.Series([1, None], dtype=decimal_type)
pd.to_numeric(series, errors="coerce")
Issue Description
pandas.to_numeric fails to coerce PyArrow decimal Series that contain NA values. The NA values are dropped during conversion and never restored, so the converted values no longer match the length of the index:
import pandas as pd
import pyarrow as pa
decimal_type = pd.ArrowDtype(pa.decimal128(3, scale=2))
series = pd.Series([1, None], dtype=decimal_type)
pd.to_numeric(series, errors="coerce")
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[13], line 8
4 decimal_type = pd.ArrowDtype(pa.decimal128(3, scale=2))
6 series = pd.Series([1, None], dtype=decimal_type)
----> 8 pd.to_numeric(series, errors="coerce")
File /opt/homebrew/lib/python3.13/site-packages/pandas/core/tools/numeric.py:319, in to_numeric(arg, errors, downcast, dtype_backend)
316 values = ArrowExtensionArray(values.__arrow_array__())
318 if is_series:
--> 319 return arg._constructor(values, index=arg.index, name=arg.name)
320 elif is_index:
321 # because we want to coerce to numeric if possible,
322 # do not use _shallow_copy
323 from pandas import Index
File /opt/homebrew/lib/python3.13/site-packages/pandas/core/series.py:575, in Series.__init__(self, data, index, dtype, name, copy, fastpath)
573 index = default_index(len(data))
574 elif is_list_like(data):
--> 575 com.require_length_match(data, index)
577 # create/copy the manager
578 if isinstance(data, (SingleBlockManager, SingleArrayManager)):
File /opt/homebrew/lib/python3.13/site-packages/pandas/core/common.py:573, in require_length_match(data, index)
569 """
570 Check the length of data matches the length of the index.
571 """
572 if len(data) != len(index):
--> 573 raise ValueError(
574 "Length of values "
575 f"({len(data)}) "
576 "does not match length of index "
577 f"({len(index)})"
578 )
ValueError: Length of values (1) does not match length of index (2)
This seems to be because the conversion to a NumPy type sets the dtype to object, which makes the condition that guards re-adding the NA values false. The NA values are therefore never re-added, leaving a final values array that is shorter than the original index.
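The drop-then-restore pattern described above can be modeled in plain Python. This is only a simplified sketch of the mechanism, not the pandas implementation: the function coerce_with_mask and its restore_nulls flag are hypothetical names introduced for illustration, and lists stand in for the Arrow and NumPy arrays.

```python
from decimal import Decimal


def coerce_with_mask(values, restore_nulls=True):
    """Hypothetical model: drop nulls, convert the rest, optionally restore nulls."""
    # Remember where the nulls were before dropping them.
    mask = [v is None for v in values]
    # Convert only the non-null values (stand-in for the NumPy conversion).
    converted = [Decimal(v) for v, is_null in zip(values, mask) if not is_null]
    if not restore_nulls:
        # Models the skipped branch: the result stays shorter than the input.
        return converted
    # Re-insert a null at every masked position.
    remaining = iter(converted)
    return [None if is_null else next(remaining) for is_null in mask]


index = [0, 1]
values = [1, None]

broken = coerce_with_mask(values, restore_nulls=False)
fixed = coerce_with_mask(values)

print(len(broken), len(index))  # lengths differ, as in the ValueError above
print(len(fixed), len(index))   # lengths match once nulls are restored
```

When the restore step is skipped, the result has one element against an index of two, which mirrors the "Length of values (1) does not match length of index (2)" error in the traceback.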
Expected Behavior
I'd expect the series to get converted (to values of decimal.Decimal type, with dtype=object) without raising an exception, preserving the null elements.
Installed Versions
pandas : 2.2.3
numpy : 2.2.2
pytz : 2025.1
dateutil : 2.9.0.post0
pip : 25.0
Cython : None
sphinx : None
IPython : 8.32.0
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.13.4
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : 2025.2.0
html5lib : None
hypothesis : 6.125.2
gcsfs : None
jinja2 : 3.1.5
lxml.etree : None
matplotlib : 3.10.3
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : 19.0.0
pyreadstat : None
pytest : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.15.2
sqlalchemy : 2.0.38
tables : None
tabulate : None
xarray : 2025.1.2
xlrd : None
xlsxwriter : None
zstandard : 0.23.0
tzdata : 2025.1
qtpy : None
pyqt5 : None