289,232 questions
2
votes
2
answers
106
views
Pandas dataFrame standard deviation issue
I have these details.
import pandas as pd
import numpy as np
# Sample dataset
data = {
"date": pd.date_range("2025-10-01", periods=7),
"sales": [200, 220, 250, np....
-1
votes
1
answer
178
views
Different Date Formats In Same Column [closed]
I have a dataset with a column, Joining_Date, that contains different date formats. I want to convert them all to a single date format.
df['Joining_Date']
0 2016-10-26
2 16/08/2018
3 2017/06/...
1
vote
2
answers
240
views
How to make index into exponential notation?
I've tried putting the NumPy scientific notation function in different places, but that didn't work. Other than that, I haven't been able to find anything that works.
def optimal_resistor(name="&...
1
vote
1
answer
134
views
How to transfer Python variables from one Flask view to another?
I am working on a Flask app constructed as follows:
The starting HTML template, controlled by a view, displays a text box where the user can enter a SQL query to run against a SQLite database ...
0
votes
3
answers
224
views
Efficient way to extract substrings (like date and strike price) from filenames in Pandas
I’m working on a project where I process a large dataset of CSV files. Each file has a name like this:
NATURALGAS21FEB25270CE.MCX, and I want to extract the following:
21FEB25 - DATE
270-...
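One way to pull the pieces out of a name like NATURALGAS21FEB25270CE.MCX is a single regular expression with named groups. This is only a sketch: the excerpt names the date (21FEB25) and the 270 field, so reading 270 as the strike price (per the title), the leading letters as a symbol, and the trailing letters as an option type are assumptions, as is the helper name parse_filename.

```python
import re

# Assumed layout: <SYMBOL><DDMMMYY><STRIKE><TYPE>.<EXT>
# Only the date and the "270" field are named in the question; the other
# group names are illustrative guesses.
FILENAME_RE = re.compile(
    r"^(?P<symbol>[A-Z]+)"           # leading letters, e.g. NATURALGAS
    r"(?P<date>\d{2}[A-Z]{3}\d{2})"  # fixed-width date, e.g. 21FEB25
    r"(?P<strike>\d+)"               # remaining digits, e.g. 270
    r"(?P<kind>[A-Z]+)"              # trailing letters, e.g. CE
    r"\.(?P<ext>[A-Z]+)$"            # extension, e.g. MCX
)


def parse_filename(name):
    """Return the named parts of a filename, or None if it does not match."""
    match = FILENAME_RE.match(name)
    if match is None:
        return None
    parts = match.groupdict()
    parts["strike"] = int(parts["strike"])  # digits only, so int() is safe
    return parts


parsed = parse_filename("NATURALGAS21FEB25270CE.MCX")
print(parsed["date"], parsed["strike"])
```

The same pattern should also work with pandas' Series.str.extract, which returns one column per named group, so the whole filename column can be handled without a Python-level loop.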
1
vote
0
answers
108
views
Is there a way to complement / supplement a stubs package with additional type information in a stubs file?
I am using pyright in strict mode and I would like to avoid littering my code with exceptions telling pyright to ignore some line. I have been using my own stub file to great success in the case of ...
2
votes
2
answers
234
views
Use-case for interpolation (pandas) and linear space (numpy)
What is the difference between using an interpolation in the pandas package and the linear space of the numpy package? For instance in the following:
import numpy as np
import pandas as pd
In [12]: np....
0
votes
1
answer
68
views
I cannot get the storms to appear on the graph and it is not plotting correctly
This is what I have to do: The police department has requested the creation of a Storm and Crime Data Report (SCDR) using historical data from the city of Miami for the calendar year 2024. Miami police ...
0
votes
1
answer
114
views
When should I use .copy() after filtering a DataFrame in pandas? [closed]
I am working on a project where I frequently filter my DataFrame and then return it for further processing. For example, in one file I have this code:
df = df[df['Ticker'].str.startswith("...
0
votes
3
answers
85
views
Find percentage of grouped variable in a Dataframe in pandas [closed]
I have a dataframe where column 1 has values like 'A', 'B' etc (categorical variable) and column 2 (also a nominal categorical variable) has values like 'city 1', 'city 2' etc. I want to group by ...
0
votes
0
answers
76
views
How can I calculate XIRR from irregular cash flows in Python and validate results against Excel’s XIRR function?
I am working on calculating the Extended Internal Rate of Return (XIRR) for a series of irregular cash flows using Python. The cash flows occur on uneven dates, and I want the result to match Excel’s ...
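For reference, Excel documents XIRR as the rate r that solves sum_i P_i / (1 + r)^((d_i - d_1) / 365) = 0, where d_1 is the first date in the schedule. Below is a minimal sketch of that calculation in plain Python. It uses bisection, which is not necessarily the solver Excel uses internally, and the names xnpv and xirr, the search bracket, and the sample cash flows are all illustrative assumptions.

```python
from datetime import date


def xnpv(rate, cashflows):
    """Net present value of (date, amount) pairs on a 365-day year.

    The first entry's date is taken as the start of the schedule.
    """
    first_day = cashflows[0][0]
    return sum(
        amount / (1.0 + rate) ** ((day - first_day).days / 365.0)
        for day, amount in cashflows
    )


def xirr(cashflows, low=-0.99, high=10.0, tol=1e-9, max_iter=200):
    """Find the rate where xnpv is zero by bisection on [low, high]."""
    f_low = xnpv(low, cashflows)
    f_high = xnpv(high, cashflows)
    if f_low == 0:
        return low
    if f_high == 0:
        return high
    if f_low * f_high > 0:
        raise ValueError("no sign change in bracket; widen low/high")
    for _ in range(max_iter):
        mid = (low + high) / 2.0
        f_mid = xnpv(mid, cashflows)
        if abs(f_mid) < tol or (high - low) / 2.0 < tol:
            return mid
        if f_low * f_mid < 0:
            high = mid            # root lies in the lower half
        else:
            low, f_low = mid, f_mid  # root lies in the upper half
    return (low + high) / 2.0


# Hypothetical flows exactly 365 days apart, so the exact answer is 10%.
flows = [(date(2023, 1, 1), -1000.0), (date(2024, 1, 1), 1100.0)]
rate = xirr(flows)
print(round(rate, 6))
```

When comparing against Excel, small differences in the last decimals are likely, since the stopping tolerance and starting guess differ between solvers.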
-2
votes
1
answer
103
views
What is the best way in pandas (or Python) to “explode” the list of cities into separate rows? [duplicate]
I am learning data science and working with JSON data.
Suppose I have a JSON file with provinces and their populations, but each province also has a list of cities. For example:
When I load this into ...
0
votes
0
answers
27
views
Python display and count unique elements from a dataset [duplicate]
I have a dataset populated from an API call to Splunk.
The dataset contains the following:
time                 destip      destport  transport
2025-09-17 22:03:09  172.16.5.1  53        UDP
2025-09-17 22:03:10  172.16.5.1  53        UDP
...
2
votes
1
answer
178
views
Python to copy a sheet from one Excel file to another, sheet names are the same
Using Python, I am trying to copy the contents of an Excel sheet to another Excel file with the same sheet name. The pandas code to write to the destination sheet seems to overwrite the whole ...
0
votes
1
answer
124
views
How to export a Jupyter Notebook with plots and rich formatting to PDF without losing formatting?
I am using Jupyter Notebook inside an Anaconda virtual environment and I have a notebook with multiple plots using matplotlib and pandas DataFrames with styled formatting.
When I try to export the ...
0
votes
0
answers
43
views
Databricks Cleanroom: Does Invited Collaborator have capability to adjust Timeout Error Timer?
I am running tests on a notebook from within Databricks' "Cleanroom" environment (therefore it is run on serverless compute managed by Databricks).
I'm running into the following timeout ...
0
votes
1
answer
115
views
Overwrite Excel/CSV with pandas while file is open
Is it possible to overwrite and save a DataFrame to an Excel or CSV file if that file is already open?
I tried using df.to_excel("file.xlsx", index=False) to overwrite an existing Excel file ...
2
votes
1
answer
83
views
How to set background seaborn style in a multi-plot figure with subplot()
I have a multi-plot bar plot figure produced with matplotlib/seaborn and I'd like to control the tick lines and background style. When I try to use sns.set_style("whitegrid"), the ...
0
votes
0
answers
50
views
How can I rename a duplicated column to make it unique in pandas? [duplicate]
I have a dataframe with two duplicated columns that can be in variable positions.
In the example, COLUMN1 and COLUMN4 are optional.
COLUMN1 CLASIFICATION CLASIFICATION COLUMN4
or
CLASIFICATION ...
0
votes
2
answers
83
views
Tiingo Python Pandas Datetime
I am grabbing OHLC stock price data from Tiingo using the following code:
df = get_tiingo_data(ticker, start_date="1990-01-01", end_date="2025-12-31")
Which calls this function:
...
1
vote
0
answers
56
views
Some pieces are missing in my Python First Fit Decreasing bar cutting algorithm
I'm implementing a bar cutting optimization in Python using the First Fit Decreasing (FFD) algorithm. I have a list of orders, each with a quantity and length. The code should allocate all pieces into ...
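For comparison, First Fit Decreasing itself is short: expand the orders into individual pieces, sort them longest first, and place each piece in the first bar with enough remaining length, opening a new bar when none fits. The sketch below is not the asker's code. The (quantity, length) tuple format, the stock_length parameter, and the function name are assumptions, lengths are assumed to be integers, and saw kerf is ignored.

```python
def first_fit_decreasing(orders, stock_length):
    """Allocate pieces to bars using First Fit Decreasing.

    orders: iterable of (quantity, length) tuples (assumed format).
    Returns a list of bars, each a list of the piece lengths cut from it.
    """
    # Expand each order into individual pieces, longest first.
    pieces = sorted(
        (length for quantity, length in orders for _ in range(quantity)),
        reverse=True,
    )
    bars = []       # each bar is a list of piece lengths
    remaining = []  # unused length left in the bar at the same index
    for piece in pieces:
        if piece > stock_length:
            raise ValueError(f"piece {piece} is longer than the stock bar")
        for i, free in enumerate(remaining):
            if piece <= free:
                bars[i].append(piece)
                remaining[i] -= piece
                break
        else:
            # No existing bar had room, so open a new one.
            bars.append([piece])
            remaining.append(stock_length - piece)
    return bars


# Hypothetical orders: two pieces of 6, three of 4, one of 2, on bars of 10.
plan = first_fit_decreasing([(2, 6), (3, 4), (1, 2)], stock_length=10)
print(plan)
```

A quick sanity check for missing pieces is to compare the total number of allocated pieces with the sum of the order quantities.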
0
votes
1
answer
97
views
Trying to get data from yfinance for a 2-day timeframe, but getting an error when aggregating the data
I'm trying to get some data with yfinance using a 2-day timeframe, which doesn't exist by default in yfinance. So, I tried to aggregate 1 day + 1 day. But I'm getting an error ...
1
vote
2
answers
103
views
Convert dataframe of boolean values to list of ranges where value is True (e.g. output: "column-header, 24-79")
My goal is to compare the contents of two 2-d arrays (~15k rows, 8 columns) and get a list of the rows/columns where the comparison (greater-than) is true. I need that output in a readable format, ...
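The core of this task is collapsing runs of consecutive True values into (start, end) pairs. Here is a minimal sketch for a single column, written in plain Python so it does not depend on any particular pandas API. The function name true_ranges and the sample values are illustrative, and applying it to each column of the comparison result is an assumption about the asker's data.

```python
from itertools import groupby


def true_ranges(flags):
    """Return inclusive (start, end) index pairs for each run of True values."""
    ranges = []
    position = 0
    for value, run in groupby(flags):
        length = sum(1 for _ in run)  # size of this run of equal values
        if value:
            ranges.append((position, position + length - 1))
        position += length
    return ranges


# Hypothetical column of comparison results.
column = [False, True, True, True, False, True]
print(", ".join(f"{start}-{end}" for start, end in true_ranges(column)))
```

A single True row comes out as a range whose start and end are equal, which keeps the output format uniform.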
3
votes
2
answers
114
views
Saving a figure with the name of passed dataframe
I have a function that receives a pandas dataframe as an argument, plots it, and saves the generated figure with the name of the passed dataframe.
For instance, this is the function:
def plot_function(df):
...
2
votes
1
answer
95
views
Sort each row of a pandas column consisting of delimited strings
CONTEXT
Let's say I have df, which consists of a column of delimited strings that I would like to sort.
import pandas as pd
df = pd.DataFrame({'sort_me':['foo; bar','foo; bar','bar; foo']})
df
...
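One straightforward approach is to split each string on the delimiter, sort the parts, and join them back. The sketch below uses a plain list so it runs without pandas. The helper name sort_delimited is hypothetical, and the "; " delimiter is taken from the sample data above.

```python
def sort_delimited(text, sep="; "):
    """Sort the delimited parts of a string and join them back together."""
    return sep.join(sorted(text.split(sep)))


# Same values as the sort_me column in the question.
values = ["foo; bar", "foo; bar", "bar; foo"]
sorted_values = [sort_delimited(v) for v in values]
print(sorted_values)
```

With the DataFrame from the question, the same helper could be applied with df['sort_me'].map(sort_delimited).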
4
votes
2
answers
126
views
Adding subtotals to a pandas dataframe
I was looking to add subtotals to a pandas dataframe, a question that I found is asked here often. The answers that use the deprecated DataFrame.append aren't relevant anymore, so I figured a more ...
1
vote
2
answers
114
views
Conditionally updating values in the same dataframe, getting ValueError: Can only compare identically-labeled Series objects
I'm trying to update a DataFrame value in place and am unable to do so. I've tried several different methods, none of which have worked. Looking up information online only gets me examples of comparing ...
2
votes
1
answer
59
views
Making matplotlib boxplot include columns with NaN values
I've been trying to plot a data frame as a box plot using matplotlib. My data frame looks something like this:
9-1 9-2 9-3 9-4 9-5
0 23 16.0 18.0 18.0 26
1 27 18.0 20.0 17.0 33
...
0
votes
2
answers
92
views
Merging Pandas dataframes on column combinations
Scenario: I am trying to merge 2 pandas dataframes. DF1 has the bulk data, and DF2 is a sort of mapping. Based on the combination of the values of 3 different columns, I want to put a column from DF2 ...
0
votes
1
answer
42
views
ModuleNotFoundError: No module named 'pandas._config'
I am trying to run a Python project with pandas as a dependency, installed with Poetry.
name : pandas
version : 2.3.2 ...
1
vote
1
answer
78
views
Is there a function to filter a dataframe by a column value and produce a dictionary?
In Python pandas, I'm wondering if there is a built-in function that does the same as df_to_dict below. Speed is of the essence, as my dataframe can have thousands of rows. Basically return a ...
2
votes
1
answer
167
views
Why does Pandera print failing rows with pa.check() and a lambda function but not on a column check?
New to using Pandera. I want it to print the record(s) that fail the check. This is the simple check I want, fail when the system capacity is over 500:
import pandera.pandas as pa
import pandas as pd
...
4
votes
4
answers
207
views
Get rows with unique value in a specific column in pandas
The following is my data frame.
id name class
--------------------------
0 Nick a
1 Jane b
2 Jacon a
3 Jack b
4 Cooze a
-----...
0
votes
2
answers
120
views
How to set a cell value based on the time range in DatetimeIndex in Pandas
I am trying to create a new column in a pandas dataframe, where the values are based on ranges of the time (hours and minutes) of the DatetimeIndex.
Here is my dataframe:
DatetimeIndex ...
0
votes
0
answers
65
views
apply function versus vectorised operation in pandas dataframe [duplicate]
I am working with a DataFrame of almost 1M rows and want to compute a column as a function of two others. My first idea was to use .apply(axis=1) with a lambda function to do the operation, but it was ...
3
votes
1
answer
73
views
Plotly choropleth of India not showing states properly (too small)
I am trying to create a choropleth of India that shows railway accident data. When I run it, a choropleth is created, but the states of India are too small and do not reflect their real ...
0
votes
1
answer
115
views
How do I find top values based on different columns?
I am using a dataset of video game sales and I am trying to make a pie chart of the top 10 publishers based on global_sales.
Right now I have:
data['Publisher'].value_counts().head(10).plot.pie(autopct=...
0
votes
1
answer
77
views
How to retrieve pandas dataframe agent-generated file?
I have a pandas dataframe agent deployed in an Azure FastAPI app service.
agent = create_pandas_dataframe_agent(
llm,
df,
verbose=True,
...
0
votes
0
answers
31
views
PVLib Soiling Models: Discrepancy between Kimber and HSU Models
I am using PVLib modeling to estimate soiling for my project.
I am checking both HSU and Kimber models to compare and identify the best result.
The problem is, I would expect outputs to be somewhat ...
2
votes
2
answers
141
views
Best way to assign a scalar to a new DataFrame column with a specific dtype
I am writing a routine to load a large dataset into a Pandas DataFrame from a bespoke text format.
As part of this process, I need to add new columns to a DataFrame. Sometimes I need to broadcast a ...
1
vote
2
answers
101
views
How to find columns not matching in Pandas Merge?
I'm performing data validation in Python using the Pandas module. I have two datasets to compare source and target data for expected values. I've successfully merged two dataframes using pd.merge and ...
-4
votes
2
answers
152
views
Extract values from each row corresponding to a column and add it in a single row [closed]
I have a dataframe with multiple rows that can be combined into a single row. I'm not sure how to do it.
Input DataFrame:
Emp#  Name  Week1  Week2  Week3  Week4
1     mary  45     0      0      0
1     mary  0      45     0      0
1     mary  0      0      63     ...
2
votes
1
answer
78
views
Change the decimal value of each value in each column to 0.5 while maintaining the same leading integer python pandas
CONTEXT
I am NOT trying to round to the nearest 0.5. I know there are questions on here that address that. Rather, I am trying to change the decimal value of each value in each row to 0.5 while ...
1
vote
2
answers
99
views
Remove items within pandas DataFrameGroupBy groups
I have a dataframe df made up of n columns which are groups and one, "data". This dataframe is then grouped on the n group columns.
df = pd.DataFrame(data={"g0": ["foo", ...
1
vote
0
answers
84
views
Plot datetimeindex.time data in Matplotlib
I did find a couple of threads related to similar topics, but the use case is always slightly different, as the goal often is just to format the x-axis of a plot or similar. So I am opening this new ...
2
votes
1
answer
154
views
Manipulating a large dataframe most efficiently
Imagine I have this dataframe called temp:
temp = pd.DataFrame(index = [x for x in range(0, 10)], columns = list('abcd'))
for row in temp.index:
temp.loc[row] = default_rng().choice(10, size=4,...
-1
votes
3
answers
245
views
Unable to scrape 2nd table from Fbref.com for players table
I would like to scrape the 2nd table on the page linked below - https://fbref.com/en/comps/9/2023-2024/stats/2023-2024-Premier-League-Stats on Google Colab. But pd.read_html only gives me ...
2
votes
1
answer
89
views
Using a column value to find the Column header name in Pandas
Scenario: I have a pandas dataframe. I am trying to use the values in a given column (year) to find the relevant header name and add it to a new column (year_name). For example, if the dataframe looks ...
2
votes
1
answer
270
views
How do I capture missing nan values from Pandas 2.3.0 using Pydantic 2.11.7
Prerequisites:
Python 3.11.7
Pandas 2.3.0
Numpy 2.1.3
Pydantic 2.11.7
In the pandas documentation, it states that missing values for numeric data types are filled in with numpy.nan:
https://pandas....
6
votes
2
answers
390
views
Pandas does not fail, warn, or skip when rows have more columns than the header
I'm new to Python and to Pandas, and I am desperately trying to understand how or why this is happening.
I have a CSV file with some data, in which some rows have extra commas (,) that are not ...