289,232 questions
0
votes
1
answer
56
views
Illegal instruction when importing pandas along with wxpython (python 3, Windows)
Trying to upgrade my python 2.7 scripts to the latest python 3.x for my PC running Windows 7-x64. To do so, I installed python 3.8.9.
I get a nasty error when I press ENTER after typing "import ...
3
votes
4
answers
194
views
Using multiple masks based on ID
I have a dataframe df that consists of two columns: an id, and a date. The id is a number from 1-3 & is not unique; the date is a datetime object.
id, date
1, 2020-5-11
1, 2019-3-2
2, 2018-7-29
3, ...
0
votes
1
answer
47
views
How to read LZO compressed pandas HDF5 files?
Here is the situation: I have data saved into pandas HDF5 files. Some data is compressed using lzo and some using blosc:zstd. Under RHEL-7, I was able to read both types of files. Then, I was ...
1
vote
1
answer
76
views
ModuleNotFoundError: No module named 'pandas' on Raspberry Pi even after installing from requirements.txt
I’m running a Python project on a Raspberry Pi (Debian / Raspberry Pi OS, Python 3.9.2).
The project has its own virtual environment (.venv), and I launch it with a script (alsv4) that starts multiple ...
6
votes
4
answers
250
views
Pandas continuous time periods
Given a table:
id  cost  from                 to
43  4     2025-09-01 01:00:00  2025-09-01 01:30:00
42  4     2025-09-01 01:30:00  2025-09-01 02:00:00
41  4.8   2025-09-01 02:00:00  2025-09-01 02:30:00
40  4.05  2025-09-01 02:30:00  2025-...
3
votes
2
answers
314
views
Can a DataFrame have multiple, different types in the same column?
I have a DataFrame, as shown below. In order to build it, I started with adding the numbers (100 for spec A Sum in 2020 and so on).
Additionally I add the median as a date.
2020 ...
1
vote
1
answer
82
views
Combine separate plots into one plot in Python
I have created the following pandas dataframe:
ds = {
'Date' : ['2025-08-22 16:00:00', '2025-08-22 16:01:00', '2025-08-22 16:02:00', '2025-08-22 16:03:00', '2025-08-22 16:04:00', '2025-08-...
-1
votes
1
answer
98
views
How to sum up a particular row of number inputs in a DataFrame [closed]
I want to add the inputs in a particular row of data frame as total. But the output is
XYZ: OBJECT1 Object2
NUMBERS: 1 2
Where the inputted XYZ are Object2 and OBJECT1 and the inputted Numbers are 1 ...
3
votes
3
answers
156
views
splitting a data frame in a way like dealing cards
I have a data frame of students sorted by grade that I want to split into 3 data frames, such that an even number of students per grade is in each of the 3 groups. I thought of it like dealing cards, ...
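A minimal sketch of the "dealing cards" idea, assuming the frame is already sorted by grade. The `students` frame and its column names are hypothetical stand-ins, not taken from the question:

```python
import pandas as pd

# Hypothetical data: students already sorted by grade.
students = pd.DataFrame({
    "name": [f"s{i}" for i in range(9)],
    "grade": [1, 1, 1, 2, 2, 2, 3, 3, 3],
})

n_groups = 3
# Take every n-th row starting at offset i, like dealing cards to n players.
groups = [students.iloc[i::n_groups] for i in range(n_groups)]
```

Because neighbouring rows share a grade, each group should end up with a similar mix of grades. The split is only as even as the row counts per grade allow.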
1
vote
1
answer
168
views
Sorting a column of max values from a multicolumn pandas data frame
I have a multi-column pandas data frame of years and corresponding cumulative rainfall values from 1 to 183 (October to March). That means in each column the last value is the maximum column value, ...
0
votes
0
answers
76
views
How can I recreate ewm(adjust=False).std() in Pandas?
This recreates ewm(adjust=True).std(): pandas ewm var and std, but I have no luck replicating the calculations in ewm(adjust=False).std(). Replicating ewm(False).mean() is easy but how is the bias ...
0
votes
2
answers
137
views
Pandas merge one-to-many [duplicate]
I'm trying to merge two pandas DataFrames on multiple columns. It is a many-to-one relationship. There are many of the same values in df1 but only one value in df2.
These are the example DataFrames :
df1 =...
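For illustration, a many-to-one merge on several columns might look like the sketch below. The frames and the column names (`key1`, `key2`, `x`, `y`) are made up, since the question's own data is cut off:

```python
import pandas as pd

# Hypothetical frames: df1 repeats key pairs, df2 has each pair once.
df1 = pd.DataFrame({"key1": ["a", "a", "b"], "key2": [1, 1, 2], "x": [10, 20, 30]})
df2 = pd.DataFrame({"key1": ["a", "b"], "key2": [1, 2], "y": ["A", "B"]})

# validate="many_to_one" raises MergeError if df2 has duplicate key pairs.
merged = df1.merge(df2, on=["key1", "key2"], how="left", validate="many_to_one")
```

The `validate` argument is optional. It is included here so that an unexpected duplicate in df2 fails loudly instead of silently multiplying rows.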
-2
votes
1
answer
65
views
drop.na code doesn't show up when using print(data_frame) [duplicate]
I wanna delete missing values from a certain column:
#deleting rows with missing values
data_excel.dropna(subset=['Budget Betrag'])
then I wanna check whether it's working with
print(data_excel)
But ...
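One likely cause, assuming the printed frame still shows the missing rows: `dropna` returns a new DataFrame by default and leaves the original unchanged. A small sketch with made-up data (only the column name `Budget Betrag` comes from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical data; one row has a missing 'Budget Betrag'.
data_excel = pd.DataFrame({
    "Projekt": ["a", "b", "c"],
    "Budget Betrag": [100.0, np.nan, 250.0],
})

# dropna returns a new frame, so keep the result.
cleaned = data_excel.dropna(subset=["Budget Betrag"])
```

Assigning back to the same name (`data_excel = data_excel.dropna(...)`) works just as well.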
2
votes
1
answer
84
views
How to do linear interpolation in PySpark without Pandas UDF (only using Spark API)?
I have a Spark DataFrame with the following structure:
shock_rule_id  DATE        value
A              2024-01-01  100
A              2024-01-02  null
A              2024-01-03  130
B              2024-01-01  50
B              2024-01-02  null
B              2024-01-03  null
B              2024-01-04  80
...
2
votes
2
answers
107
views
Pandas read_csv: Skip rows containing invalid data that can cause data_type parsing errors
The csv file can contain string values in certain integer columns, and I want to ignore them or handle them via a callback when that happens. I tried using on_bad_lines='skip/warn', however it gets triggered only on ...
2
votes
2
answers
66
views
How to retrieve one data value from the result of a pandas DataFrame.groupby().mean()
Using Pandas 2.3.2 on Python 3.9.2 via JupyterLab.
I've collected a bunch of thermal data from a thing. I've already collated that data into DataFrame chunks that look like this:
zone data ...
3
votes
3
answers
106
views
Pandas read_csv, load empty/missing column values as NaN while loading empty string for quoted empty strings values in csv file
My csv file contains empty strings "" as well as missing column values ,,. When I load it with read_csv(), both are loaded as either empty string or NaN depending on keep_default_na and ...
2
votes
1
answer
147
views
What is NaT in Pandas?
I have a dataframe with some "NaT" values in a datetime column. What does that mean?
project status completed
0 windows done 2025-08-20
1 doors done 2025-08-21
2 hvac ...
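As a quick illustration: `NaT` ("Not a Time") is the missing-value marker pandas uses in datetime columns, in the same role that NaN plays for floats. The values below are made up and only loosely follow the question's `completed` column:

```python
import pandas as pd

# A missing entry in a datetime column becomes NaT.
completed = pd.to_datetime(pd.Series(["2025-08-20", "2025-08-21", None]))

# isna() treats NaT as missing, just like NaN.
missing = completed.isna()
```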
0
votes
1
answer
68
views
Brier Skill Score returns NaN in cross_val_score with imbalanced dataset
I’m trying to evaluate classification models on a highly imbalanced fraud dataset using the Brier Skill Score (BSS) as the evaluation metric.
The dataset has ~2133 rows and the target Fraud_Flag is ...
3
votes
1
answer
75
views
How to pass argument to func in `pandas.resampler.agg()` when using dict input?
I am trying to resample a pandas dataframe, and for some columns I would like to sum. Additionally, I want to get None/NaN as the result when there are no rows in a resampling period. For aggregation on ...
1
vote
0
answers
137
views
Conversion of a pyspark DataFrame with a Variant column to pandas fails with an error
When I try to convert a pyspark DataFrame with a VariantType column to a pandas DataFrame, the conversion fails with an error 'NoneType' object is not iterable. Am I doing it incorrectly?
Sample code:
...
7
votes
3
answers
441
views
How to sort pandas groups by (multiple/all) values of the groups?
I am trying to do a somewhat complicated group and sort operation in pandas. I want to sort the groups by their values in ascending order, using successive values for tiebreaks as needed.
I have read ...
0
votes
2
answers
203
views
"Cannot set a DataFrame with multiple columns to the single column" when script is in function
I have a function which processes dataframe of 6 columns. It looks like this:
def Process_DF():
DF_6cols = "some data"
#Two functions to split column containing Column Val1 and ...
0
votes
1
answer
74
views
I am trying to create a separate data frame when specific conditions are met [duplicate]
I have a data frame that consists of a column that contains the Gender data. I want to segregate it by gender and create 2 separate dataframes.
I tried to do this by implementing the code below:
for i ...
2
votes
0
answers
248
views
Illegal instruction (core dumped) when running Streamlit app on Raspberry Pi 4
I’m trying to run a small Streamlit app on my Raspberry Pi 4.
For testing, I made a small version with just core functionality:
# main.py
import pandas as pd
import streamlit as st
def main():
st....
0
votes
2
answers
93
views
python basics: can someone help me understand processing one file vs processing all files at a time?
This will output all csv files from the directory, but only show one of the csv dataframes.
OUTPUT_PATH = "./static/output/"
FILE_LIST = glob.glob("./static/*.json")
def all_data():...
-2
votes
1
answer
234
views
pandas convert date failed - pd.to_datetime(df['xxx'], format='%Y-%m-%d').dt.date
I am facing one little problem. I am storing some date time data and the data is
#secCode,secName,announcementTitle,announcementId,announcementTime
003816,xxx name,2024report,1222913141,1743004800000
...
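The sample value 1743004800000 looks like a Unix timestamp in milliseconds rather than a '%Y-%m-%d' string. If that assumption holds, a sketch like this might be closer to what is needed. The one-row frame is made up from the sample line, and `announcementDate` is a hypothetical column name:

```python
import pandas as pd

# One-row frame built from the sample value shown in the question.
df = pd.DataFrame({"announcementTime": [1743004800000]})

# Interpret the integers as milliseconds since the Unix epoch (UTC).
df["announcementDate"] = pd.to_datetime(df["announcementTime"], unit="ms").dt.date
```

The result is in UTC. If the source timestamps are meant as local time, a timezone conversion would be needed before taking the date.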
4
votes
1
answer
145
views
How do you remove already filtered category values from DataFrames from plots and pivot tables?
My dataframes show video game titles, platforms, year of release, revenue, etc.
I have filtered the original dataframe "df_samplegames", which has 29 different platforms (type category), ...
1
vote
0
answers
62
views
Pandas Styler.bar() not showing on Excel column
I'm working on a default style for some reports I have to do. I'd like to add the Styler.bar() method.
Sample of the dataset used for integration:
symbol,date,open,high,low,close,volume
AAL,2014-01-02,...
4
votes
4
answers
572
views
Find max value of a column, then find another value in the same row, and copy that value to a new column [closed]
I have the following frame:
lst = [
['SPXW 250715C06310000', '7/14/2025', 2.74, 2.87, 2.60, 2.65, 14, '8:30:00'],
['SPXW 250715C06310000', '7/14/2025', 2.80, 2.80, 2.50, 2.53, 61, '8:31:00'],
...
2
votes
1
answer
95
views
graph_objects.Surface axis tick spacing inconsistent on x and y
Given hemi.csv data of:
244,1000,1500,2000,2500,3000,3500,5000
0,14,18,-42,-72,-84,-86,-94,-119
12.5,277,231,185,139,144,150,161,158
25.1,416,394,370,348,361,374,404,396
37.6,483,587,633,653,566,585,...
3
votes
6
answers
282
views
removing rows that don't fit the repeating sequence in pandas dataframe
I have a pandas dataframe that looks like this:
A B C D
0 1 2 3 0
1 4 5 6 1
2 7 8 9 2
3 10 10 10 0
4 10 10 10 1
5 1 2 3 0
6 4 5 6 1
7 7 ...
3
votes
1
answer
105
views
Convert dictionary rows to new dataframe
After importing some nested JSON data, I'm trying to create a new dataframe from all of the dictionary key / value pairs in an existing column.
Starting point:
>>> df['schedules']
0 {'...
2
votes
3
answers
162
views
How to rewrite python code using Pyscript
My code works as a python file but I am struggling to make it work using pyscript. I am sharing the code which I tried.
main.py
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"Tesseract-...
1
vote
1
answer
67
views
How to dynamically rename headers in a bank statement CSV/Excel using Python and Pandas?
I have bank statements in both Excel and CSV formats. The headers can vary slightly depending on the bank or the file export, for example:
TRAN_DATE, CHQNO, PARTICULARS, DR, CR, BAL, SOL
I want to ...
2
votes
1
answer
87
views
How to add value from a list as new column value in dataframe, if existing column value starts with that value from list
I have a dataframe which looks like shown below:
CALL_START IMSI
0 24.07.2025 12:00:51 123456888888888
1 24.07.2025 17:58:57 123456999999999
2 24.07.2025 17:05:47 ...
4
votes
2
answers
106
views
What to do when the pandas error position overflows?
So, I'm experimenting with pandas with the IMDB files, especially title.basic.tsv. When trying to parse the runtimeMinutes column to "Int64", I get an error
ValueError: Unable to parse ...
4
votes
1
answer
113
views
Find max/min value in a column in a range of data (multiindex) and append to a different column
I have the following dataframe:
import pandas as pd
import csv
lst = [['SPXW 250715C06310000', '7/14/2025', 2.74, 2.87, 2.60, 2.65, 14, '8:30:00'],
['SPXW 250715C06310000', '7/14/2025', 2.80, ...
2
votes
0
answers
54
views
broken x axis and broken dual y-axes - draw replot across subplots
My post relates to this one here:
Formatting a broken y axis in python matplotlib
I have borrowed code from this post and adapted it to what I am doing.
I am attempting to create a graph whereby I am ...
5
votes
3
answers
167
views
Pandas - return the -2 row
If I have an input.txt file:
apples grapes alpha pears
chicago paris london
yellow blue red
+++++++++++++++++++++
apples grapes beta pears
chicago paris london
car truck ...
8
votes
6
answers
558
views
Concatenating a range of rows in pandas
I have a pandas dataframe like this:
c1 c2 c3 c4
0 1 2 3 0
1 10 20 30 1
2 100 200 300 2
3 1 2 3 0
4 10 ...
0
votes
1
answer
84
views
How to remove duplicate rows in pandas DataFrame based on a column?
I have a pandas DataFrame with multiple rows, and some rows have the same value in a specific column (e.g., id). I want to remove the duplicate rows while keeping only the first occurrence (or ...
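A minimal sketch using `drop_duplicates`, assuming the column is called `id` as in the question. The data itself is made up:

```python
import pandas as pd

# Hypothetical frame with repeated ids.
df = pd.DataFrame({
    "id": [1, 2, 1, 3, 2],
    "value": ["a", "b", "c", "d", "e"],
})

# Keep the first row seen for each id; keep="last" would keep the last one.
deduped = df.drop_duplicates(subset="id", keep="first")
```

The original index labels are preserved. Chain `.reset_index(drop=True)` if a fresh 0..n-1 index is wanted.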
0
votes
1
answer
67
views
Pandas transform list Column to string
I'm reading the PowerBI GetActivities and have some problems writing the data to a pandas dataFrame.
A given column, called Datasets, is sometimes present, and if so it is again a JSON object with (as ...
1
vote
2
answers
133
views
What's wrong with my python script to separate out tables in excel that have a blank row in between them
I have multiple tables where it's like
Column A  Column B
Cell 1    Cell 2
Cell 3    Cell 4
---Blank row---
Column A  Column B
Cell 1    Cell 2
Cell 3    Cell 4
---Blank row---
Column A  Column B
Cell 1    Cell 2
Cell ...
2
votes
4
answers
173
views
How to scrape a website that has <span class="ellipsis">…</span> in between number on a dynamic table with Selenium Python
I am trying to scrape dividend data for the stock "Vale" on the site https://investidor10.com.br/acoes/vale3/. The dividend table has 8 buttons (1, 2, 3, ..., 8) and "Next" and &...
3
votes
0
answers
142
views
Can't parse a valid ISO 8601 datetime string pulled from CSV
I have a set of data that I am pulling from an Excel CSV. The column I am using has the timestamps in ISO 8601 format with fractional seconds (YYYY-MM-DDTHH:MM:SS.SSZ)
I have tried using dateutil, ...
7
votes
0
answers
179
views
Column level alignment in pandas DataFrame printing
When a pandas DataFrame is printed, the MultiIndex column levels are aligned with the 1st (left most) column instead of the last (right most) column:
import numpy as np
import pandas as pd
df = pd....
1
vote
2
answers
241
views
Pandas DtypeWarning "Columns have mixed types" for large CSV file (no error with one less line)
I am using Pandas (v2.2.3) to read/load a (relatively large) CSV file using read_csv(). The full file has about 500k lines.
The function throws a DtypeWarning stating that "Columns have mixed ...
0
votes
4
answers
241
views
How can I compare two pandas DataFrames with object-type columns, with a numeric tolerance?
I have two pandas dataframes: One assembled manually in Python, the other imported from a dashboard's .csv output.
All columns in both dataframes are objects, and look like this:
2020
2021
2022
2023
0....
0
votes
4
answers
174
views
How to find the values that appear the most
I have a dataframe that has the number 6 in each row. This is the main number I would like to use to find the values that appear most often with the number 6 and that have more than 2 of the same values. ...