289,232 questions
0
votes
0
answers
36
views
Suitable Pandas installation on 32-bit Python (3.10, 3.11)
I am trying to use the Kiwoom OpenAPI (for building an automated stock trading program), which requires a 32-bit Python environment. However, to successfully use the Kiwoom OpenAPI I need to set up pandas in ...
0
votes
3
answers
62
views
How to modify multiple pandas DataFrame columns by applying if/else
I have a DataFrame with columns Age, Salary and others. If I use:
df['Age'] = df['Age'].apply(lambda x : x+100 if x>30 else 0)
Then I can modify the Age column with the if else condition. Also, if ...
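One way the same rule could be extended to several columns at once is a vectorized DataFrame.where call instead of a per-column apply. This is only a minimal sketch: the sample values and the choice of Age and Salary as the target columns are assumptions, since the question text is cut off.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({"Age": [25, 35, 45], "Salary": [20, 40, 10]})

cols = ["Age", "Salary"]  # columns that should receive the same rule

# Keep x + 100 where x > 30, otherwise use 0 (same rule as the lambda above).
df[cols] = (df[cols] + 100).where(df[cols] > 30, 0)

print(df)
```

Working on the whole column selection avoids calling a Python lambda once per cell, which tends to matter on large frames.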
3
votes
1
answer
110
views
Use pandas merge_asof to achieve inexact left join
I have two pandas series:
right_series
Index  Value
1      0.1
2      0.2
3      0.3
6      0.6
7      0.7
left_series
Index  Value
1      0.1
5      0.5
10     1.0
I would like to join right_series on left_series by the indices, such that ...
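The matching rule is cut off in the excerpt, so the following is a sketch under an assumption: each left index is matched to the closest right index that is less than or equal to it (direction="backward"). The two series are rebuilt as DataFrames with the values shown above, and the suffixes are made up for illustration.

```python
import pandas as pd

left = pd.DataFrame({"Index": [1, 5, 10], "Value": [0.1, 0.5, 1.0]})
right = pd.DataFrame({"Index": [1, 2, 3, 6, 7],
                      "Value": [0.1, 0.2, 0.3, 0.6, 0.7]})

# merge_asof needs both frames sorted by the key column.
joined = pd.merge_asof(
    left.sort_values("Index"),
    right.sort_values("Index"),
    on="Index",
    direction="backward",        # assumption: take the last right key <= left key
    suffixes=("_left", "_right"),
)

print(joined)
```

Switching direction to "forward" or "nearest", or adding a tolerance, changes which right row is picked; which one is appropriate depends on the part of the question that is not visible here.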
0
votes
1
answer
82
views
Why does groupby().apply() produce inconsistent results on identical groups when the DataFrame has overlapping indices?
I noticed that groupby().apply() produces different results for two groups that look identical, except that the overall DataFrame has duplicate index values.
Here is a minimal reproducible example:
...
-1
votes
1
answer
68
views
Using list to find variables from data in rows [closed]
I am trying to find a combination where it will go through the data to find matched variables for any value from the list First_row is found, any value from the list Second_row and any value from the ...
Advice
1
vote
3
replies
126
views
Best way to clean awkward Excel column headers in python/pandas?
I've got four years of daily school attendance data spread across 40+ Excel files (one for each month) and the sheets are set up in a truly annoying fashion, with each date in one merged cell in the ...
2
votes
4
answers
149
views
How to split dataframe into multiple sub-dataframes based on column value
I got a dataframe df1 which looks like this:
Column1  Column2
13       1
12       1
15       0
16       0
15       1
14       1
12       1
11       0
21       1
45       1
44       0
The 1s indicate that a measurement started, I don't know how many 1s will be in one ...
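The exact splitting rule is truncated above, so this sketch assumes that every run of consecutive 1s in Column2 should become its own sub-dataframe. The run label is built by comparing each value with the previous one and taking a cumulative sum.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Column1": [13, 12, 15, 16, 15, 14, 12, 11, 21, 45, 44],
    "Column2": [1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0],
})

# A new run starts wherever Column2 differs from the previous row.
run_id = df1["Column2"].ne(df1["Column2"].shift()).cumsum()

# Assumption: only the runs where Column2 == 1 are wanted.
mask = df1["Column2"].eq(1)
pieces = [group for _, group in df1[mask].groupby(run_id[mask])]

for piece in pieces:
    print(piece, end="\n\n")
```

Because the run length is never hard-coded, this works regardless of how many 1s appear in a row.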
5
votes
2
answers
106
views
How to resample timeseries with origin aligned to start of year
Consider the following pandas Series with a DatetimeIndex of daily values (using day-of-year as an example):
import pandas as pd
dti = pd.date_range("2017-11-02", "2019-05-21", ...
0
votes
2
answers
92
views
How to Create a Pandas Dataframe from JSON Nested Objects [closed]
I'm trying to create a Pandas DataFrame from a JSON file that looks like this:
{
"GameID": "1,218,463,841",
"Date - Start": "1761097369",
"Date - End...
1
vote
1
answer
109
views
Pandas converts Excel strings like ‘2004E205’ to scientific notation — how to prevent this
How can I handle string values that contain patterns like xxxE205 (e.g., 2004E205), which are used as unique codes in my company? I explicitly read the column as a string in pandas, but values ...
Tooling
0
votes
2
replies
67
views
How to export or import TOON in pandas?
I would like to know how to export or import TOON (Token-Oriented Object Notation) in pandas.
2
votes
1
answer
129
views
Problem converting a column to datetime format
I have a data frame and I am trying to convert the time column into a datetime format. The first step I did was:
data['time'] = data.time
data['time']=pd.to_datetime(data['time'], format='%H:%M:%S.%f')...
3
votes
2
answers
211
views
Efficiently get first indices of consecutive identical digits in big pandas DataFrames
I have a DataFrame with a column Digit of digits at base 10. For example
import numpy as np
import pandas as pd
df = pd.DataFrame({
"Digit": [
1, 3, 5, 7, 0, 0, 0,
4, 8, ...
0
votes
1
answer
87
views
How to count appearance of all items in a row on Pandas Dataframe [closed]
I'm currently learning the Pandas library in Python (without AI assistance), and in one of my tasks I needed to count how many times each item appeared in a row of a DataFrame. Here's an example of ...
2
votes
1
answer
73
views
Problem with the x-axis of the graph: the time does not render
I am working on a dashboard using Shiny for Python and Plotly Express. I am trying to create a Gantt chart (using px.timeline) to visualize the operating periods of different boilers (ON/OFF states).
...
-4
votes
0
answers
75
views
How to combine two pandas DataFrames [duplicate]
I am trying to add one pandas DataFrame to another DataFrame. How can I do this in the style of list.append?
usernames = {"anvar":"anvar123", "behruz":"Bex124", ...
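A possible sketch using pd.concat, which returns a new DataFrame rather than modifying one in place the way list.append does. The column names user and password are hypothetical, chosen only to give the truncated dictionary above a tabular shape.

```python
import pandas as pd

# Hypothetical frames; the column names are assumptions for illustration.
first = pd.DataFrame({"user": ["anvar"], "password": ["anvar123"]})
second = pd.DataFrame({"user": ["behruz"], "password": ["Bex124"]})

# concat stacks the rows; ignore_index=True renumbers them 0..n-1.
combined = pd.concat([first, second], ignore_index=True)

print(combined)
```

When many pieces need to be added, collecting them in a list and calling pd.concat once is usually cheaper than concatenating inside a loop.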
Best practices
0
votes
6
replies
109
views
Slow database table insert (upload) with Pandas to_sql. What is the fastest method?
End Objective
Download some data from the internet (about 5 GB in size)
Possibly convert some strings/datetimes
Upload to Postgres database
I have written some code which uploads some data to a ...
1
vote
2
answers
139
views
How to get a true/false without duplicates when comparing two Pandas dataframes?
I have one dataframe with sessions - one session, one row, so SID is unique. The session has a doctor name.
SID  Doctor   Patient
1    robby    david
2    langdon  sara
3    langdon  michael
I have another dataframe ...
12
votes
0
answers
326
views
Not displaying DataFrame's name in Data Wrangler extension of VSCode, displaying "Data grid"
I have been using the Data Wrangler extension in VS Code for a while; it is very useful for analyzing datasets and filtering some columns to see the features. When I opened a dataframe in it, it used to ...
0
votes
1
answer
84
views
set states to indicate on and off with timestamp
def prepare_dataframe(df):
# Map original CSV column names to internal aliases for easier access
df.rename(columns={
'Bomba Calor - Temperatura de Aire (°C)': 'temp_aire',
'...
1
vote
1
answer
96
views
Python, parse nested JSON to make it flat for CSV
I'm trying to store API output in a CSV file or database and cannot figure out how to handle the keys in "tierList". In my case each row should be one bin, and I need the keys as columns in my output.
Is ...
4
votes
1
answer
94
views
Export pandas.DataFrame.column.name attribute during pd.to_excel() export
a = np.array(["foo", "foo", "foo", "foo", "bar", "bar",
"bar", "bar", "foo", "foo", ...
0
votes
0
answers
39
views
Pandas merge on one of two criteria [duplicate]
I have a table/df that holds a set of code and value pairs. The codes are a mix of old (legacy) and new codes due to process changes. I have a second table/df that holds the old codes, new codes, ...
0
votes
1
answer
55
views
No module named 'pyspark.sql.metrics' when working with pickle or joblib on Databricks
I read data from Databricks
import pandas as pd
import joblib
query = 'select * from table a'
df = spark.sql(query)
df = df.toPandas()
df.to_pickle('df.pickle')
joblib.dump(df, 'df.joblib')
...
Advice
0
votes
0
replies
48
views
List of paths and sizes to sunburst plot
I have the following data format:
foo0/bar0/a0/b0/c0 50
foo0/bar0/a0/b1 30
foo1/bar0/a0/b0 10
foo1/bar1 20
foo1/bar2/a0/b0/c0 20
foo1/bar2/a0/b0/c1 30
I'd like to create a sunburst plot out of this. ...
1
vote
1
answer
59
views
use of melt into a df for long to wide and df.loc
I'd like to know whether there is a better way than df_filtered_dates = df.loc[start_date:end_date], or whether this approach is fine. I'm using it to filter rows between the dates I choose from ...
-3
votes
1
answer
97
views
create dataframe from csv in PythonAnywhere [closed]
I am trying to display the headers of a data frame I created based on a csv file using the PythonAnywhere free version. I keep getting a huge error message and I don't understand what I did wrong.
...
Best practices
0
votes
3
replies
99
views
How to access specific, indexed elements of a Pandas Dataframe for math?
What is the right/pythonic way to do math on a few indexed elements in a Pandas Dataframe?
I tried a few ways but they seem awkward and confusing:
df = pd.DataFrame({'x': [1, 2, 3, 4, 5, 6, 7, 9, ]})
...
0
votes
1
answer
79
views
Pandas dataframe headers into an openpyxl table
Essentially, I haven't been able to write a pandas DataFrame into an Excel table properly. I have the output working, but the headers are still being passed as if they were an extra row
ws.append([])
ws....
1
vote
1
answer
62
views
KeyError when trying to access Symbol columns from NASDAQ Other Symbols FTP export
I am trying to make a program to get a list of all (or at least almost all) USA listed stocks on all exchanges. I got AI to generate the following program suggestion:
import pandas as pd
from ftplib ...
1
vote
2
answers
86
views
How to store number as a text while exporting from pandas to excel
I have a DataFrame that contains an A/C number column, where some values are longer than 15 digits. When I export the DataFrame to Excel using .to_excel(), Excel automatically converts these long ...
1
vote
1
answer
65
views
How to assign int input value as column name in pandas [duplicate]
I need to take an integer input value, assign it to a variable, and then use that variable as a column name to get data from a pandas DataFrame.
Data:
1 10 20 30
2 40 50 60
Steps:
Assign input to ...
1
vote
0
answers
64
views
Why does DataFrame.apply() use axis=1 for rows instead of axis=0 in pandas? [duplicate]
I was reading the pandas documentation:
pandas.DataFrame.apply documentation
In this definition, it seems to me like the use of axis is the exact opposite of what it means in other functions?
To apply ...
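A small sketch of the two settings may help here. With axis=0 the function receives each column in turn (the work runs down the row index), and with axis=1 it receives each row in turn. The frame below is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [10, 20]})

# axis=0: the function is called once per column, giving one result per column.
per_column = df.apply(lambda col: col.sum(), axis=0)

# axis=1: the function is called once per row, giving one result per row.
per_row = df.apply(lambda row: row.sum(), axis=1)

print(per_column)
print(per_row)
```

Read this way, the axis names the direction that gets collapsed, which matches df.sum(axis=0) and df.sum(axis=1).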
1
vote
1
answer
113
views
Different epoch time from the same datetime
Here is my code:
import pandas as pd
import datetime
df = pd.DataFrame({'str_date': ['2023091004']})
df['epoch'] = pd.to_datetime(df['str_date'], format='%Y%m%d%H').astype(int) // 10**9
dt = ...
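The rest of the code is cut off, so the following rests on an assumption: the second value comes from calling .timestamp() on a naive datetime.datetime. pandas treats a naive timestamp as UTC when converting to an epoch, while datetime.timestamp() interprets a naive value in the machine's local time zone. The sketch also swaps the astype(int) // 10**9 step for a Timedelta division, which does not depend on the stored time resolution.

```python
import datetime

import pandas as pd

df = pd.DataFrame({"str_date": ["2023091004"]})
parsed = pd.to_datetime(df["str_date"], format="%Y%m%d%H")

# Seconds since 1970-01-01, with the naive value treated as UTC.
pandas_epoch = int(((parsed - pd.Timestamp("1970-01-01"))
                    // pd.Timedelta(seconds=1)).iloc[0])

naive = datetime.datetime(2023, 9, 10, 4)

# Explicit UTC: agrees with the pandas value.
utc_epoch = int(naive.replace(tzinfo=datetime.timezone.utc).timestamp())

# No tzinfo: interpreted in local time, so it differs by the local UTC offset.
local_epoch = int(naive.timestamp())

print(pandas_epoch, utc_epoch, local_epoch)
```

On a machine whose local zone is UTC all three numbers coincide; elsewhere the last one is shifted by the UTC offset in effect on that date.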
0
votes
0
answers
59
views
entsoe-py query_imbalance_(prices|volumes) fails with ValueError: invalid literal for int(): '1,346' in parser — best fix?
I’m fetching ENTSO-E imbalance prices/volumes with entsoe-py and hit a parser crash because the <position> field contains a thousands separator comma (e.g. "1,346"), which int() can’t ...
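The failure itself can be reproduced and sidestepped in plain Python. The helper below, parse_position, is a hypothetical name and not part of entsoe-py; it only sketches the normalization step of removing the thousands separator before calling int().

```python
def parse_position(text: str) -> int:
    """Convert a position string such as "1,346" to an int by dropping commas."""
    return int(text.strip().replace(",", ""))


print(parse_position("1,346"))
print(parse_position("12"))
```

Whether this belongs in a local patch, a wrapper around the library call, or an upstream fix is a separate decision that this sketch does not settle.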
0
votes
0
answers
118
views
How to get value of cell in dataframe
I'm new to Python, so please be lenient.
I want to read the value of a single cell from a dataframe based on the selected row. I do this as below, but I get the value not when I click on a record (...
1
vote
2
answers
153
views
After encoding my categorical columns in a pandas dataframe, I was left with too many columns. How can I drop some?
I am using Python with a pandas dataframe, it is a CSV of Steam games, and I have the categorical columns of publishers, developers, categories, genres, and tags, but categories, genres, and tags are ...
3
votes
2
answers
134
views
Calculate cumulative value based on another column [duplicate]
Having this kind of pandas dataframe
df = pd.DataFrame({
'ts_diff':[0, 0, 738, 20, 29, 61, 42, 18, 62, 41, 42, 0, 0, 729, 43, 59, 42, 61, 44, 36, 61, 61, 42, 18, 62, 41, 42, 0, 0]
})
ts_diff - is ...
-6
votes
1
answer
116
views
Using for loop to find a common value for every time another common value appears
import pandas as pd
a=1
b=2
c=3
for n in range(10, len(df)-1):
if df.loc[n].isin([a]).any() and df.loc[n].isin([b]).any() :
for x in range(0, ...
Best practices
0
votes
7
replies
122
views
Value count accuracy
I was trying to calculate the accuracy of my RoBERTa labels against the pre-existing dataset labels.
I was confused about how to work with pandas's value_counts(); I don't know how to do operations on it. Before, the ...
3
votes
3
answers
193
views
Filter a pandas df: per group, keep only non-null rows if we have them, else keep a single null row
Hopefully the title is reasonably intuitive, edits welcome. Say I have this dataframe:
df = pd.DataFrame({'x': ['A', 'B', 'B', 'C', 'C', 'C', 'D', 'D'],
'y': [None, None, 1, 2, 3, 4,...
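One possible approach uses two boolean masks: rows that have a value, and the first row of each group that has no value at all. The data in the question is truncated, so the frame below is a made-up stand-in with the same column names.

```python
import pandas as pd

# Hypothetical data; only the column names x and y come from the question.
df = pd.DataFrame({"x": ["A", "B", "B", "C", "D", "D"],
                   "y": [None, None, 1, 2, None, None]})

has_value = df["y"].notna()

# True for every row whose group contains at least one non-null y.
group_has_value = has_value.groupby(df["x"]).transform("any")

# True for the first row of each group.
first_in_group = ~df["x"].duplicated()

# Keep non-null rows, plus one null row for groups that are entirely null.
result = df[has_value | (~group_has_value & first_in_group)]

print(result)
```

Both masks are computed without a Python-level loop over groups, and the original row order is preserved.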
Best practices
0
votes
5
replies
204
views
What is the ideal production-grade infrastructure to deploy a Python FastAPI service for computing option Greeks (e.g., Black–Scholes)?
I am building a Python service to compute option Greeks, but I have never worked with Python before, and I would appreciate advice from experienced Python developers.
FLOW
...
2
votes
1
answer
121
views
Concatenate Tables Based on Column Information in Python [duplicate]
I have several dataframes pulled from a file. The variable that holds all of these dataframe names is Data_Tables.
These dataframes all have the same columns, and I want to concatenate the dataframes based on the ...
2
votes
4
answers
128
views
How to find a common value using if statement
I am still a beginner in Python. I am trying to find a common value with an if statement:
import pandas as pd
df = pd.read_csv("data.csv")
for n in range(2, len(df)):
if df.loc[n].isin([2]...
5
votes
3
answers
189
views
How to use Pandas style objects to format values with a hyperlink based on the index?
I want to format a Pandas DataFrame with a hyperlink based on the index.
import pandas as pd
df = (pd.DataFrame([dict(food='bananas', count=33, color='yellow'),
dict(food='apples',...
1
vote
0
answers
78
views
Why does the pivoted dataframe contain information about columns that weren't included in the pivot? [duplicate]
There is a dataframe with MultiIndex columns:
import pandas as pd
df = pd.DataFrame({
"A": ["foo", "foo", "bar", "bar"],
"B": [...
5
votes
3
answers
168
views
Slice a pandas dataframe at specific index points
I have the pandas DataFrame below:
import pandas as pd
data = pd.DataFrame({'x1':range(10, 18), # Create pandas DataFrame
'x2':['a', 'b', 'b', 'c', 'd', 'a', 'b', 'd'],
...
1
vote
1
answer
90
views
Awswrangler: Parquet read into multiple of expected space
In a Lambda, I'm using AWS Wrangler to read data out of a date partitioned set of parquets and concatenate them together. I am doing this by calling wr.s3.read_parquet in a loop, compiling the loaded ...
3
votes
1
answer
215
views
How to extract table from PDF with boxes into pandas dataframe
I have code that detects a table in a PDF that appears after a specific section, and parses the information in the table and copies it into a pandas dataframe.
Now, I want to indicate whether a box is ...
0
votes
0
answers
135
views
New installation get this: main loop could not convert string '2023-01-02' to float64
I have just installed the new Mint 22.2 as my old hard disk with Mint 21 seems to have some issues. I use Eric-IDE to execute my .py script.
The script is first downloading stock values from a certain ...