Newest 'pandas' Questions - Page 4

0 votes

1 answer

56 views

Illegal instruction when importing pandas along with wxpython (python 3, Windows)

Trying to upgrade my python 2.7 scripts to the latest python 3.x for my PC running Windows 7-x64. To do so, I installed python 3.8.9. I get a nasty error when I press ENTER after typing "import ...

papin

81

asked Sep 4 at 18:27

3 votes

4 answers

194 views

Using multiple masks based on ID

I have a dataframe df that consists of two columns: an id, and a date. The id is a number from 1-3 & is not unique; the date is a datetime object. id, date 1, 2020-5-11 1, 2019-3-2 2, 2018-7-29 3, ...

NotLost

143

asked Sep 3 at 23:18

0 votes

1 answer

47 views

How to read LZO compressed pandas HDF5 files?

Here is the situation: I have data saved into pandas HDF5 files. Some data is compressed using lzo and some using blosc:zstd. Under RHEL-7, I was able to read both types of files. Then, I was ...

S.V

2,855

asked Sep 3 at 14:36

1 vote

1 answer

76 views

ModuleNotFoundError: No module named 'pandas' on Raspberry Pi even after installing from requirements.txt

I’m running a Python project on a Raspberry Pi (Debian / Raspberry Pi OS, Python 3.9.2). The project has its own virtual environment (.venv), and I launch it with a script (alsv4) that starts multiple ...

May Ochia

13

asked Sep 3 at 13:10

6 votes

4 answers

250 views

Pandas continuous time periods

Given a table: id cost from to 43 4 2025-09-01 01:00:00 2025-09-01 01:30:00 42 4 2025-09-01 01:30:00 2025-09-01 02:00:00 41 4.8 2025-09-01 02:00:00 2025-09-01 02:30:00 40 4.05 2025-09-01 02:30:00 2025-...

Hal

131

asked Sep 1 at 13:44

3 votes

2 answers

314 views

Can a DataFrame have multiple, different types in the same column?

I have a DataFrame, as shown below. In order to build it, I started with adding the numbers (100 for spec A Sum in 2020 and so on). Additionally I add the median as a date. 2020 ...

Stefan Bongers

55

asked Aug 31 at 19:22

1 vote

1 answer

82 views

Combine separate plots into one plot in Python

I have created the following pandas dataframe: ds = { 'Date' : ['2025-08-22 16:00:00', '2025-08-22 16:01:00', '2025-08-22 16:02:00', '2025-08-22 16:03:00', '2025-08-22 16:04:00', '2025-08-...

Giampaolo Levorato

1,762

asked Aug 31 at 15:57

-1 votes

1 answer

98 views

How to sum up a particular row of number inputs in a DataFrame [closed]

I want to add the inputs in a particular row of data frame as total. But the output is XYZ: OBJECT1 Object2 NUMBERS: 1 2 Where the inputted XYZ are Object2 and OBJECT1 and the inputted Numbers are 1 ...

Hydra

19

asked Aug 31 at 13:22

3 votes

3 answers

156 views

splitting a data frame in a way like dealing cards

I have a data frame of students sorted by grade that I want to split into 3 data frames, such that an even number of students per grade is in each of the 3 groups. I thought of it like dealing cards, ...

eriknau

31

asked Aug 31 at 1:47

1 vote

1 answer

168 views

Sorting a column of max values from a multicolumn pandas data frame

I have a multi-column pandas data frame of years and corresponding cumulative rainfall values from 1 to 183 (October to March). That means in each column the last value is the maximum column value, ...

Zilore Mumba

1,594

asked Aug 30 at 19:16

0 votes

0 answers

76 views

How can I recreate ewm(adjust=False).std() in Pandas?

This recreates ewm(adjust=True).std(): pandas ewm var and std, but I have no luck replicating the calculations in ewm(adjust=False).std(). Replicating ewm(False).mean() is easy but how is the bias ...

Luluz

103

asked Aug 30 at 12:35

0 votes

2 answers

137 views

Pandas merge one-to-many [duplicate]

I'm trying to merge two pandas DataFrames on multiple columns. It is a many-to-one relationship. There are many of the same values in df1 but only one value in df2. These are the example DataFrames: df1 =...

thor

283

asked Aug 29 at 13:40

-2 votes

1 answer

65 views

drop.na code doesn't show up when using print(data_frame) [duplicate]

I wanna delete missing values from a certain column: #deleting rows with missing values data_excel.dropna(subset=['Budget Betrag']) then I wanna check whether it's working with print(data_excel) But ...

Lisa Nagel

15

asked Aug 29 at 8:57
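A minimal sketch of the likely cause in the question above, assuming the result of `dropna` was never assigned back. The column `Projekt` and all values are made up for illustration; only `data_excel` and `Budget Betrag` come from the excerpt.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data_excel (values are invented)
data_excel = pd.DataFrame({
    "Budget Betrag": [100.0, np.nan, 250.0],
    "Projekt": ["A", "B", "C"],  # assumed extra column
})

# dropna returns a new DataFrame; the original is left untouched
data_excel.dropna(subset=["Budget Betrag"])
rows_before = len(data_excel)  # still includes the row with the missing value

# Assign the result back to keep the filtered frame
data_excel = data_excel.dropna(subset=["Budget Betrag"])
rows_after = len(data_excel)

print(data_excel)
```

Printing after the reassignment shows only the rows where `Budget Betrag` is present.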

2 votes

1 answer

84 views

How to do linear interpolation in PySpark without Pandas UDF (only using Spark API)?

I have a Spark DataFrame with the following structure: shock_rule_id DATE value A 2024-01-01 100 A 2024-01-02 null A 2024-01-03 130 B 2024-01-01 50 B 2024-01-02 null B 2024-01-03 null B 2024-01-04 80 ...

Abhishek

21

asked Aug 29 at 4:51

2 votes

2 answers

107 views

Pandas read_csv: Skip rows containing invalid data that can cause data_type parsing errors

The csv file can contain string values to certain integer columns and I want to ignore/handle via callback if that happens, tried using on_bad_lines='skip/warn' however it gets triggered only on ...

Despicable me

774

asked Aug 29 at 2:27

2 votes

2 answers

66 views

How to retrieve one data value from the result of a pandas DataFrame.groupby().mean()

Using Pandas 2.3.2 on Python 3.9.2 via JupyterLab. I've collected a bunch of thermal data from a thing. I've already collated that data into DataFrame chunks that look like this: zone data ...

Brian A. Henning

1,577

asked Aug 28 at 20:44

3 votes

3 answers

106 views

Pandas read_csv, load empty/missing column values as NaN while loading empty string for quoted empty strings values in csv file

My csv file contains empty string "" as well as missing column values ,,. When I load it with read_csv(), both are loaded as either empty string or NaN depending on keep_default_na and ...

Despicable me

774

asked Aug 28 at 19:34

2 votes

1 answer

147 views

What is NaT in Pandas?

I have a dataframe with some "NaT" values in a datetime column. What does that mean? project status completed 0 windows done 2025-08-20 1 doors done 2025-08-21 2 hvac ...

wjandrea

33.9k

asked Aug 28 at 17:08
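As a rough illustration of the question above: `NaT` ("Not a Time") is the missing-value marker pandas uses in datetime columns, the counterpart of `NaN` for numbers. The excerpt is truncated, so the `hvac` row below (its status and its missing date) is an assumption.

```python
import pandas as pd

df = pd.DataFrame({
    "project": ["windows", "doors", "hvac"],
    "status": ["done", "done", "in progress"],  # last status is assumed
    # None is converted to NaT when parsed as a datetime
    "completed": pd.to_datetime(["2025-08-20", "2025-08-21", None]),
})

# isna() treats NaT as missing, just like NaN
missing = df["completed"].isna()

# Keep only the rows that have a completion date
finished = df[df["completed"].notna()]

print(df)
print(missing.tolist())
```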

0 votes

1 answer

68 views

Brier Skill Score returns NaN in cross_val_score with imbalanced dataset

I’m trying to evaluate classification models on a highly imbalanced fraud dataset using the Brier Skill Score (BSS) as the evaluation metric. The dataset has ~2133 rows and the target Fraud_Flag is ...

Br0k3nS0u1

139

asked Aug 28 at 12:20

3 votes

1 answer

75 views

How to pass argument to func in `pandas.resampler.agg()` when using dict input?

I am trying to resample a pandas dataframe, and for some columns I would like to sum. Additionally, I want to get None/NaN as the result when there are no rows in a resampling period. For aggregation on ...

KamiKimi 3

97

asked Aug 28 at 10:16

1 vote

0 answers

137 views

Conversion of a pyspark DataFrame with a Variant column to pandas fails with an error

When I try to convert a pyspark DataFrame with a VariantType column to a pandas DataFrame, the conversion fails with an error 'NoneType' object is not iterable. Am I doing it incorrectly? Sample code: ...

Ghislain Fourny

7,429

asked Aug 27 at 11:32

7 votes

3 answers

441 views

How to sort pandas groups by (multiple/all) values of the groups?

I am trying to do a somewhat complicated group and sort operation in pandas. I want to sort the groups by their values in ascending order, using successive values for tiebreaks as needed. I have read ...

Jessica

1,813

asked Aug 26 at 20:54

0 votes

2 answers

203 views

"Cannot set a DataFrame with multiple columns to the single column" when script is in function

I have a function which processes dataframe of 6 columns. It looks like this: def Process_DF(): DF_6cols = "some data" #Two functions to split column containing Column Val1 and ...

Danylo Kuznetsov

25

asked Aug 26 at 20:13

0 votes

1 answer

74 views

I am trying to create a separate data frame when specific conditions are met [duplicate]

I have a data frame that consists of a column that contains the Gender data. I want to segregate it by gender and create 2 separate dataframes. I tried to do this by implementing the code below: for i ...

akhilesh mudliar

9

asked Aug 26 at 14:41

2 votes

0 answers

248 views

Illegal instruction (core dumped) when running Streamlit app on Raspberry Pi 4

I’m trying to run a small Streamlit app on my Raspberry Pi 4. For testing, I made a small version with just core functionality: # main.py import pandas as pd import streamlit as st def main(): st....

ole

11

asked Aug 26 at 13:29

0 votes

2 answers

93 views

python basics: can someone help me understand processing one file vs processing all files at a time?

This will output all csv files from the directory, but only show one of the csv dataframes. OUTPUT_PATH = "./static/output/" FILE_LIST = glob.glob("./static/*.json") def all_data():...

shrykullgod

43

asked Aug 25 at 22:31

-2 votes

1 answer

234 views

pandas convert date failed - pd.to_datetime(df['xxx'], format='%Y-%m-%d').dt.date

I am facing one little problem. I am storing some date time data and the data is #secCode,secName,announcementTitle,announcementId,announcementTime 003816,xxx name,2024report,1222913141,1743004800000 ...

user824624

8,170

asked Aug 24 at 23:44

4 votes

1 answer

145 views

How do you remove already filtered category values from DataFrames from plots and pivot tables?

My dataframes show video game titles, platforms, year of release, revenue, etc. I have filtered the original dataframe "df_samplegames", which has 29 different platforms (type category), ...

RicardoDLM

73

asked Aug 24 at 19:10

1 vote

0 answers

62 views

Pandas Styler.bar() not showing on Excel column

I'm working on a default style for some reports I have to do. I'd like to add the Styler.bar() method. Sample of the dataset used for integration: symbol,date,open,high,low,close,volume AAL,2014-01-02,...

ludovico

95

asked Aug 24 at 15:20

4 votes

4 answers

572 views

Find max value of a column, then find another value in the same row, and copy that value to a new column [closed]

I have the following frame: lst = [ ['SPXW 250715C06310000', '7/14/2025', 2.74, 2.87, 2.60, 2.65, 14, '8:30:00'], ['SPXW 250715C06310000', '7/14/2025', 2.80, 2.80, 2.50, 2.53, 61, '8:31:00'], ...

Dan

111

asked Aug 24 at 3:34

2 votes

1 answer

95 views

graph_objects.Surface axis tick spacing inconsistent on x and y

Given hemi.csv data of: 244,1000,1500,2000,2500,3000,3500,5000 0,14,18,-42,-72,-84,-86,-94,-119 12.5,277,231,185,139,144,150,161,158 25.1,416,394,370,348,361,374,404,396 37.6,483,587,633,653,566,585,...

R Schumacher

21

asked Aug 22 at 22:07

3 votes

6 answers

282 views

removing rows that don't fit the repeating sequence in pandas dataframe

I have a pandas dataframe that looks like this: A B C D 0 1 2 3 0 1 4 5 6 1 2 7 8 9 2 3 10 10 10 0 4 10 10 10 1 5 1 2 3 0 6 4 5 6 1 7 7 ...

AjWinston

129

asked Aug 21 at 21:47

3 votes

1 answer

105 views

Convert dictionary rows to new dataframe

After importing some nested JSON data, I'm trying to create a new dataframe from all of the dictionary key / value pairs in an existing column. Starting point: >>> df['schedules'] 0 {'...

skohrs

831

asked Aug 21 at 17:46

2 votes

3 answers

162 views

How to rewrite python code using Pyscript

My code works as a Python file but I am struggling to make it work using PyScript. I am sharing the code which I tried. main.py import pytesseract pytesseract.pytesseract.tesseract_cmd = r"Tesseract-...

nasrin begum pathan

105

asked Aug 21 at 15:34

1 vote

1 answer

67 views

How to dynamically rename headers in a bank statement CSV/Excel using Python and Pandas?

I have bank statements in both Excel and CSV formats. The headers can vary slightly depending on the bank or the file export, for example: TRAN_DATE, CHQNO, PARTICULARS, DR, CR, BAL, SOL I want to ...

Nitesh Kumar Singh

174

asked Aug 21 at 15:08

2 votes

1 answer

87 views

How to add value from a list as new column value in dataframe, if existing column value starts with that value from list

I have a dataframe which looks like shown below: CALL_START IMSI 0 24.07.2025 12:00:51 123456888888888 1 24.07.2025 17:58:57 123456999999999 2 24.07.2025 17:05:47 ...

urosdigital

31

asked Aug 21 at 11:54

4 votes

2 answers

106 views

What to do when the pandas error position overflows?

So, I'm experimenting with pandas with the IMDB files, especially title.basic.tsv. When trying to parse the runtimeMinutes column to "Int64", I get an error ValueError: Unable to parse ...

red_trumpet

685

asked Aug 21 at 9:59

4 votes

1 answer

113 views

Find max/min value in a column in a range of data (multiindex) and append to a different column

I have the following dataframe: import pandas as pd import csv lst = [['SPXW 250715C06310000', '7/14/2025', 2.74, 2.87, 2.60, 2.65, 14, '8:30:00'], ['SPXW 250715C06310000', '7/14/2025', 2.80, ...

Dan

111

asked Aug 21 at 5:06

2 votes

0 answers

54 views

broken x axis and broken dual y-axes - draw replot across subplots

My post relates to this one here: Formatting a broken y axis in python matplotlib I have borrowed code from this post and adapted it to what I am doing. I am attempting to create a graph whereby I am ...

jmcgowan

23

asked Aug 21 at 1:16

5 votes

3 answers

167 views

Pandas - return the -2 row

If I have an input.txt file: apples grapes alpha pears chicago paris london yellow blue red +++++++++++++++++++++ apples grapes beta pears chicago paris london car truck ...

yodish

881

asked Aug 20 at 20:40

8 votes

6 answers

558 views

Concatenating a range of rows in pandas

I have a pandas dataframe like this: c1 c2 c3 c4 0 1 2 3 0 1 10 20 30 1 2 100 200 300 2 3 1 2 3 0 4 10 ...

AjWinston

129

asked Aug 20 at 19:57

0 votes

1 answer

84 views

How to remove duplicate rows in pandas DataFrame based on a column?

I have a pandas DataFrame with multiple rows, and some rows have the same value in a specific column (e.g., id). I want to remove the duplicate rows while keeping only the first occurrence (or ...

Ruchin Patel

1

asked Aug 19 at 20:58
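One possible sketch for the question above, using `DataFrame.drop_duplicates` with `subset` and `keep`. The sample data is invented, and since the excerpt is cut off after "keeping only the first occurrence (or", the `keep="last"` variant is only an assumed alternative.

```python
import pandas as pd

# Invented sample data; only the column name "id" comes from the question
df = pd.DataFrame({
    "id": [1, 2, 1, 3, 2],
    "value": ["a", "b", "c", "d", "e"],
})

# Keep the first row seen for each id
first = df.drop_duplicates(subset="id", keep="first")

# Assumed alternative: keep the last row seen for each id
last = df.drop_duplicates(subset="id", keep="last")

print(first)
print(last)
```

Both calls return new DataFrames and keep the original index labels; `reset_index(drop=True)` can be chained if a fresh index is wanted.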

0 votes

1 answer

67 views

Pandas transform list Column to string

I'm reading the PowerBI GetActivities and have some problems writing the data to a pandas DataFrame. A given column, called Datasets, is sometimes present, and if so it's again a JSON object with (as ...

Harry Leboeuf

1

asked Aug 19 at 18:31

1 vote

2 answers

133 views

What's wrong with my python script to separate out tables in excel that have a blank row in between them

I have multiple tables where it's like Column A Column B Cell 1 Cell 2 Cell 3 Cell 4 ---Blank row--- Column A Column B Cell 1 Cell 2 Cell 3 Cell 4 --- Blank row--- Column A Column B Cell 1 Cell 2 Cell ...

empowHERek

27

asked Aug 14 at 22:16

2 votes

4 answers

173 views

How to scrape a website that has <span class="ellipsis">…</span> in between number on a dynamic table with Selenium Python

I am trying to scrape dividend data for the stock "Vale" on the site https://investidor10.com.br/acoes/vale3/. The dividend table has 8 buttons (1, 2, 3, ..., 8) and "Next" and &...

user30126350

45

asked Aug 14 at 10:04

3 votes

0 answers

142 views

Can't parse a valid ISO 8601 datetime string pulled from CSV

I have a set of data that I am pulling from an Excel CSV. The column I am using has the timestamps in ISO 8601 format with fractional seconds (YYYY-MM-DDTHH:MM:SS.SSZ) I have tried using dateutil, ...

user31262016

43

asked Aug 12 at 16:56

7 votes

0 answers

179 views

Column level alignment in pandas DataFrame printing

When a pandas DataFrame is printed, the MultiIndex column levels are aligned with the 1st (left most) column instead of the last (right most) column: import numpy as np import pandas as pd df = pd....

sds

60.5k

asked Aug 11 at 18:26

1 vote

2 answers

241 views

Pandas DtypeWarning "Columns have mixed types" for large CSV file (no error with one less line)

I am using Pandas (v2.2.3) to read/load a (relatively large) CSV file using read_csv(). The full file has about 500k lines. The function throws a DtypeWarning stating that "Columns have mixed ...

Andreas

113

asked Aug 11 at 9:18

0 votes

4 answers

241 views

How can I compare two pandas DataFrames with object-type columns, with a numeric tolerance?

I have two pandas dataframes: One assembled manually in Python, the other imported from a dashboard's .csv output. All columns in both dataframes are objects, and look like this: 2020 2021 2022 2023 0....

gorilla

47

asked Aug 10 at 14:38

0 votes

4 answers

174 views

How to find the values that appear the most

I have a dataframe that has the number 6 in each row. Which will be my main number I would like to use to find values that appear most often with the number 6 that has more than 2 of the same values. ...

Chris

63

asked Aug 10 at 7:49
