0 votes
1 answer
83 views

I have a large pandas dataframe df of something like a million rows and 100 columns, and I have to create a second dataframe df_n, same size as the first one. Several rows and columns of df_n will be ...
MBlrd's user avatar
  • 165
5 votes
1 answer
303 views

My intuition when using Pandas is that, if you have to use df.apply, it would be more optimal to group all the apply operations into one call. This was further reinforced by me learning that NumPy ...
v0rtex's user avatar
  • 53
2 votes
2 answers
101 views

The following seems to work: import pandas as pd import sklearn df = sklearn.datasets.load_iris() df = pd.DataFrame(df.data, columns=df.feature_names) df.shuffle() However this shuffle function seems ...
robertspierre's user avatar
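For the shuffling question above, a minimal sketch of one commonly used approach: `DataFrame.sample(frac=1)` returns all rows in random order. The small frame, the column names and the seed below are assumptions chosen for illustration and are not taken from the truncated question.

```python
import pandas as pd

# Hypothetical stand-in for the iris frame mentioned in the question.
df = pd.DataFrame({"sepal_length": [5.1, 4.9, 4.7, 4.6],
                   "sepal_width": [3.5, 3.0, 3.2, 3.1]})

# frac=1 samples every row without replacement, i.e. a row shuffle.
# random_state makes the shuffle reproducible; reset_index drops the old order.
shuffled = df.sample(frac=1, random_state=0).reset_index(drop=True)
```

Each row stays intact; only the row order changes.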
3 votes
1 answer
184 views

I'm working in Jupyter notebooks trying to build a stacked and filled x,y scatter bar chart from the dataframe (df_xy_columns) below: sum_y_gran PVR Group min_x min_x2 max_x max_x2 min_y ...
rfulks's user avatar
  • 33
4 votes
1 answer
168 views

I have the following script that crashes when I run it and I cannot figure out why. The script is a smaller version of a larger script, but still reproduces the error of the larger script. import ...
Claude Simon's user avatar
9 votes
5 answers
371 views

I am working on a task that seems to me a little like one-hot encoding, but notably different. What I want to do is take a row of integers from a Pandas DataFrame and produce a binary column with 1's ...
lane-h-rogers's user avatar
2 votes
2 answers
147 views

I have an array of names and roles of people within a company: Example array: names_and_titles = [ ("Samantha Reyes", "Innovation", "Product Owner"), ("Ethan ...
Imam's user avatar
  • 41
7 votes
2 answers
230 views

I'm working with a dictionary that contains a list of decimal.Decimal values as one of its fields: import pandas as pd from decimal import Decimal data = { 'Item': ['Apple', 'Banana', 'Orange'], ...
Gino's user avatar
  • 913
1 vote
1 answer
117 views

I have a pandas pivot table that shows payments made to different payees vs date, and I'm using a Grouper to group them into months, e.g.: payee payee_1 payee_2 date 2019-11-30 amount ...
Paul Worrall's user avatar
-1 votes
1 answer
70 views

I need some help. I have part of a Python script which accesses a URL field in a SQL database and then calls an API using the URL in that field. Now I cannot get the data into a dataframe to ...
Trevor Turn's user avatar
0 votes
1 answer
126 views

df = pd.DataFrame({ 'col_str': ["a", "b", "c"], 'col_lst_str': [["a", "b", "c"], ["d", "e", "f"], [...
Alexis's user avatar
  • 1,663
4 votes
3 answers
136 views

Why is pandas not formatting dates with date_format argument of to_csv? pandas.DataFrame([datetime.datetime.now().date()]).to_csv(date_format="%Y %b") ',0\n0,2025-07-31\n'
Hugo Trentesaux's user avatar
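A likely explanation for the `date_format` question above, offered as a sketch, not a definitive answer: a column of `datetime.date` objects is stored with `object` dtype, and `date_format` is applied to datetime-typed columns. Converting with `pd.to_datetime` first is one way to make the format take effect. The column name `d` and the fixed date are assumptions for illustration.

```python
import datetime

import pandas as pd

# A column of plain datetime.date objects is stored as object dtype.
df = pd.DataFrame({"d": [datetime.date(2025, 7, 31)]})
original_dtype = df["d"].dtype

# Convert to datetime64 so that to_csv can apply date_format.
df["d"] = pd.to_datetime(df["d"])
csv_text = df.to_csv(date_format="%Y %b", index=False)
```

The month abbreviation produced by `%b` depends on the active locale.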
1 vote
1 answer
138 views

I am using Azure Databricks and Azure Data Storage Explorer for my operations. I have an excel file of under 30 MB containing multiple sheets. I want to replace the data in one sheet every month when ...
spacestar's user avatar
-3 votes
1 answer
87 views

I have syntax like the one below and would like to convert it to an executable Python statement. It is stored as is in the database and used in a procedure for calculating the required value. Now I ...
Jayanth's user avatar
4 votes
5 answers
297 views

I'm working on a data cleaning task and could use some help. I have two CSV files with thousands of rows each: File A contains product shipment records. File B contains product descriptions and ...
4 votes
4 answers
175 views

I have a dataframe that looks something like this: 1 2 3 'String' '' 4 X '' '' 5 X '' '' 6 7 'String' '' 1 Y '' And I want to change the Xs and Ys (put here just to visualize) to the ...
Lucas P's user avatar
  • 47
1 vote
2 answers
134 views

I have the reverse of the problem described in "Prevent pandas from interpreting 'NA' as NaN in a string". I work with older English text data and want to write the word "nan" (i.e. Modern ...
Mat's user avatar
  • 525
6 votes
5 answers
326 views

I have N numbers, call it 3 for now: A1, A2, A3. I'd like to generate the following dataframe in Pandas: Category 1 2 3 4 5 6 7 1 A1 A1+A2 A1+A2+A3 A2+A3 A3 0 0 2 0 A2 A2+A3 A2+A3+A1 A3+A1 A1 0 3 0 0 ...
Matta's user avatar
  • 207
-2 votes
2 answers
186 views

In the code example below I am grouping a pandas series using the same series but with a modified index. The groups in the end make no sense. There is no warning or error. Could you please help me ...
karpan's user avatar
  • 597
2 votes
2 answers
93 views

We're trying to group up date counts by month and index values are returning as decimals instead of integers when series contain any number of NaTs / na values. Simplified reproducible example: import ...
Chris Dixon's user avatar
  • 1,148
1 vote
0 answers
51 views

I’m using rpy2 in Python to call R's forecast::stlm() function from within a custom wrapper function defined in R. My goal is to fit a seasonal time series model (STL + ARIMA) on a univariate time ...
RSK's user avatar
  • 765
0 votes
2 answers
77 views

this is my df: symbol year_bin metric value row 0 USA500.IDX 2025-1 total_trades 32.00 0 1 GBPUSD 2025-1 total_trades 11.00 0 2 GBPUSD 2025-1 ...
Amir's user avatar
  • 3
2 votes
1 answer
91 views

I have 2 dataframes. One is smaller, with fewer columns than the other one. I want to update df1 with values from the available columns in df2. How do I do it? Eg: df1: Jan Feb Mar Apr May Jun Jul Aug ...
Anupkumar Kasi's user avatar
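One possible approach to the update question above is `DataFrame.update`, which aligns on index and column labels and overwrites matching cells in place. This is a minimal sketch. The month values and the `a`/`b` index labels are made up, and it assumes df2 is the smaller frame and shares df1's index.

```python
import pandas as pd

# Hypothetical frames: df2 holds only some of df1's columns.
df1 = pd.DataFrame({"Jan": [1.0, 2.0], "Feb": [3.0, 4.0], "Mar": [5.0, 6.0]},
                   index=["a", "b"])
df2 = pd.DataFrame({"Feb": [30.0, 40.0]}, index=["a", "b"])

# update() modifies df1 in place, using non-NA values from df2
# wherever the index and column labels match.
df1.update(df2)
```

Columns that df2 lacks (`Jan` and `Mar` here) are left unchanged.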
1 vote
1 answer
175 views

I have the following code in Python, using Streamlit as framework: try: native_data = data.copy() # Create Altair chart with native data st.write(f"Debug: Native data type: {type(...
HuLu ViCa's user avatar
  • 5,515
1 vote
3 answers
90 views

This file is called 'html app.py' from flask import Flask, render_template, request import yfinance as yf import seaborn as sns import matplotlib.pyplot as plt import io import base64 app = Flask(...
rashmip_21's user avatar
1 vote
1 answer
79 views

I have a Pandas dataframe df with a datetime index and three columns, like this: Out[64]: rh pm25a pm25b time_stamp 2022-07-06 11:35:...
ValeA's user avatar
  • 11
1 vote
1 answer
65 views

I am trying to use pandas.set_option in my Python script to display a table, but somehow the data does not fill properly in an HTML page. Since the names in some columns are a bit longer, columns look 1 ...
Kapil's user avatar
  • 325
1 vote
1 answer
111 views

I have a large dataframe which I need to upload to SQL Server. Due to the volume of data, my code does the insert in batches. However, the insert fails if the batch has more than 1 record in it. The ...
Abhishek Sourabh's user avatar
3 votes
1 answer
68 views

I have created the following pandas dataframe: import pandas as pd import numpy as np ds = {'col1' : [234,321,284,286,287,300,301,303,305,299,288,300,299,287,286,280,279,270,269,301]} df = pd....
Giampaolo Levorato's user avatar
2 votes
1 answer
71 views

I have some parameters: A1, A2, A3, f1, f2, f3. These parameters are then used to generate a set of sinusoidal data, something like: y = A1 * sin(f1 * x) + A2 * sin(f2 * x) + A3 * sin(f3 * x) From ...
PentaGeer Joshua Meetsma's user avatar
0 votes
2 answers
83 views

I'm working with 5-min level data that only includes timestamps between 09:30 and 16:00 (dateTime is saved as a column, not as the index). After applying an operation to the group, I get additional data beyond ...
JoonHak Kim's user avatar
0 votes
1 answer
73 views

import pandas as pd df = pd.read_csv('911.csv') df['desc'].str.replace('[^a-zA-Z0-9]','').head() 0 REINDEER CT & DEAD END; NEW HANOVER; Station ... 1 BRIAR PATH & WHITEMARSH LN; ...
david yen2's user avatar
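The unchanged output in the question above is consistent with the pattern being treated as a literal string: since pandas 2.0, `Series.str.replace` defaults to `regex=False`. A minimal sketch with regex matching enabled explicitly follows. The two sample strings are shortened, hypothetical stand-ins for the `desc` column.

```python
import pandas as pd

# Hypothetical sample values standing in for df['desc'].
desc = pd.Series(["REINDEER CT & DEAD END; NEW HANOVER",
                  "BRIAR PATH & WHITEMARSH LN;"])

# regex=True makes the pattern a regular expression; every character
# that is not a letter or digit (including spaces) is removed.
cleaned = desc.str.replace(r"[^a-zA-Z0-9]", "", regex=True)
```

To keep spaces, the character class could be widened to `[^a-zA-Z0-9 ]`.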
1 vote
0 answers
62 views

I am working on a large CSV file that contains number IDs for translations followed by entries for different languages. These entries represent localization strings in an application. I was tasked ...
Hadi Farah's user avatar
  • 1,180
5 votes
4 answers
182 views

I have the following data in a CSV. "ID","OTHER_FIELDS_2" "87","25 R160 22 13 E" "87","25 R165 22 08 E" "77","" ...
learner's user avatar
  • 53
0 votes
1 answer
77 views

I have a DataFrame with monthly data that looks something like this: id date window_in_months value 1 2000-01-01 3 20 1 2000-02-01 3 30 2 2000-01-01 12 40 2 2000-02-01 12 60 I want to do a rolling ...
LattePrincess's user avatar
0 votes
1 answer
100 views

Is there a way in KDB/pykx to get only some columns as raw data while get others converted to pandas types? In the example below, I want to be able to do what is shown in the last line (for variable ...
S.V's user avatar
  • 2,855
4 votes
2 answers
171 views

I have a sample data frame like this: Id application is_a is_b is_c reason subid record 100 app_1 False False False test1 4 record100 100 app_2 True False False test2 3 ...
N9909's user avatar
  • 297
4 votes
1 answer
131 views

import pandas as pd import yfinance as yf import mplfinance as mpf df = yf.download('AMZN', start='2020-01-01', end='2025-07-31') print(df) mpf.plot(df['2020-01-01':'2020-06-01'], type='candle', ...
rashmip_21's user avatar
1 vote
4 answers
113 views

This code generates 4 separate box plots. How can I generate only one box plot for the entire matrix? import numpy as np import pandas as pd data = np.random.random(size=(4,4)) df = pd.DataFrame(data) ...
stefaniecg's user avatar
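One way to get a single box for the whole matrix, sketched under the assumption that all 16 values should be pooled: flatten the frame into one Series before plotting. The seeded generator and the name `all values` are assumptions added for reproducibility and labelling.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only to make the sketch reproducible
data = rng.random(size=(4, 4))
df = pd.DataFrame(data)

# Pool every cell of the 4x4 frame into a single one-dimensional Series.
flat = pd.Series(df.to_numpy().ravel(), name="all values")
```

With matplotlib installed, `flat.plot.box()` then draws one box instead of one per column.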
1 vote
1 answer
108 views

I have workouts logged in JSON like this: [ { "date": "2025-07-14", "workout_name": "Lower", "exercises": [ { "name...
Stilvens Parm's user avatar
0 votes
1 answer
77 views

I had something like the following code using pandas 1.x that now generates a warning in pandas 2: import pandas as pd import numpy as np df1 = pd.DataFrame({"i":[1,2,3,4,5], "a":[...
guest's user avatar
  • 139
0 votes
1 answer
59 views

I am trying to compare 2 columns, lag2open and MGC=F, and return whether one is higher, storing the result as Higher than 0, using GCClose["Higher than 0"] = [GCClose.columns[1]]>= [GCClose.columns[0]] it ...
Rafael Alexandre Sousa's user avatar
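In the question above, `GCClose.columns[1]` is a column label, so the expression compares two one-element lists of names instead of the column values. A minimal sketch of an element-wise comparison follows. The numbers and the column order (`MGC=F` first, `lag2open` second) are assumptions.

```python
import pandas as pd

# Hypothetical data; only the two column names come from the question.
GCClose = pd.DataFrame({"MGC=F": [10.0, 12.0, 11.0],
                        "lag2open": [9.5, 12.0, 11.5]})

# Compare the column values row by row, yielding a boolean column.
GCClose["Higher than 0"] = GCClose["lag2open"] >= GCClose["MGC=F"]
```

Selecting by position with `GCClose.iloc[:, 1] >= GCClose.iloc[:, 0]` would give the same result for this column order.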
0 votes
1 answer
209 views

What is the best way to convert a FastAPI query into a Polars (or pandas) dataframe? Copilot gives this: with Session(engine) as session: questions = session.exec(select(Questions)).all() ...
diogenes's user avatar
  • 2,181
0 votes
1 answer
138 views

SOLUTION as of 16JUL25: See rotabor's float_precision answer for trailing zero problem. To solve thousands separator problem gracefully without unnecessary steps, do NOT bother using polars....
JeffCh's user avatar
  • 97
2 votes
1 answer
99 views

I have a data frame with a variety of string values. For a given column, if there is any string entered, I would like to replace it with the same value (say 'fruit'). Example: data = {'item_name': ['...
Liz's user avatar
  • 365
2 votes
1 answer
71 views

I have created the following pandas dataframe, which is an example of 26 stock prices (Open, High, Low, Close): import pandas as pd import numpy as np ds = { 'Date' : ['15/06/2025','16/06/2025','17/...
Giampaolo Levorato's user avatar
2 votes
1 answer
67 views

I want to find the minimum value per row and create a new column indicating which of those columns has the lowest number. Unfortunately, it seems like pandas isn't immediately able to help in this ...
Corsaka's user avatar
  • 464
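For the row-minimum question above, pandas offers `DataFrame.min(axis=1)` for the value and `DataFrame.idxmin(axis=1)` for the label of the column holding it. This is a minimal sketch. The frame and the names `min_value` and `min_column` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"a": [3, 1, 5], "b": [2, 4, 0], "c": [9, 8, 7]})
cols = ["a", "b", "c"]  # the columns to compare

# Row-wise minimum and the label of the column where it occurs.
df["min_value"] = df[cols].min(axis=1)
df["min_column"] = df[cols].idxmin(axis=1)
```

When a row has ties, `idxmin` reports the first column in which the minimum appears.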
1 vote
1 answer
123 views

How can I simplify the code below and make it more efficient using chained operations? Currently, I am creating intermediate objects and using a for loop. I use this data: https://www.kaggle.com/...
just a tw highschooler's user avatar
-1 votes
4 answers
262 views

My task is to parse the protein names by removing the brackets and parentheses in the row. In short, I want to retain the words in front of any parentheses and brackets. Note that I need to keep ...
Ssong's user avatar
  • 466
1 vote
1 answer
102 views

I have a dataframe like below, and would like to fill values from the previous row based on the id field, so any record with 4 in colA gets the colC value of the previous colA=3 record, colB stays the ...
Connie Xu's user avatar
