{ "cells": [ { "cell_type": "markdown", "id": "cef5239a", "metadata": {}, "source": [ "# pandas: to_numeric() for Safe Conversion of Strings to Numbers\n", "`pandas.to_numeric(arg, errors='raise', downcast=None, dtype_backend=<no_default>)`\n" ] }, { "cell_type": "markdown", "id": "76c53ae1", "metadata": {}, "source": [ "`pandas.to_numeric()` converts a Series (for example, a DataFrame column) of strings to numeric values (integers or floats).\n", "- [pandas.to_numeric() - pandas 2.3.2 documentation](https://pandas.pydata.org/docs/reference/api/pandas.to_numeric.html)" ] }, { "cell_type": "markdown", "id": "17aa472c", "metadata": {}, "source": [ "Basic usage:" ] }, { "cell_type": "markdown", "id": "727b9564", "metadata": {}, "source": [ "```python\n", ">>> import pandas as pd\n", "```" ] }, { "cell_type": "markdown", "id": "9e4601c7", "metadata": {}, "source": [ "```python\n", ">>> s = pd.Series(['1.0', '2', -3])\n", ">>> pd.to_numeric(s)\n", "0    1.0\n", "1    2.0\n", "2   -3.0\n", "dtype: float64\n", ">>> pd.to_numeric(s, downcast='float')\n", "0    1.0\n", "1    2.0\n", "2   -3.0\n", "dtype: float32\n", ">>> pd.to_numeric(s, downcast='integer')\n", "0    1\n", "1    2\n", "2   -3\n", "dtype: int8\n", ">>> pd.to_numeric(s, downcast='signed')\n", "0    1\n", "1    2\n", "2   -3\n", "dtype: int8\n", ">>> # -3 does not fit an unsigned type, so no downcast is applied\n", ">>> pd.to_numeric(s, downcast='unsigned')\n", "0    1.0\n", "1    2.0\n", "2   -3.0\n", "dtype: float64\n", ">>> s = pd.Series(['apple', '1.0', '2', -3])\n", ">>> # errors='coerce' turns unparseable values into NaN instead of raising\n", ">>> pd.to_numeric(s, errors='coerce')\n", "0    NaN\n", "1    1.0\n", "2    2.0\n", "3   -3.0\n", "dtype: float64\n", ">>> pd.to_numeric(s, errors='coerce').fillna(0)\n", "0    0.0\n", "1    1.0\n", "2    2.0\n", "3   -3.0\n", "dtype: float64\n", "```" ] }, { "cell_type": "markdown", "id": "4cd955e5", "metadata": {}, "source": [ "For more information about nullable types, refer to the following article:\n", "- [pandas: nullable data types](https://cornel05.github.io/cornel.ai/notebooks/snippets/pandas/pandas_nullable_dtypes.html)\n", "\n", "Downcasting nullable integer and float dtypes, which mark missing values as `<NA>`, is also supported:" ] }, { "cell_type": "markdown", "id": "b55fce85", "metadata": {}, "source": [ "```python\n", ">>> s = pd.Series([1, 2, 3, None], dtype=\"Int64\")\n", ">>> pd.to_numeric(s, downcast=\"integer\")\n", "0       1\n", "1       2\n", "2       3\n", "3    <NA>\n", "dtype: Int8\n", ">>> s = pd.Series([1.1, 2.2, 3.3, None], dtype=\"Float64\")\n", ">>> pd.to_numeric(s, downcast=\"float\")\n", "0     1.1\n", "1     2.2\n", "2     3.3\n", "3    <NA>\n", "dtype: Float32\n", "```" ] } ], "metadata": { "jupytext": { "cell_metadata_filter": "-all", "formats": "ipynb,py:light", "main_language": "python" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.3" } }, "nbformat": 4, "nbformat_minor": 5 }