2 votes
2 answers
106 views

I have these details. import pandas as pd import numpy as np # Sample dataset data = { "date": pd.date_range("2025-10-01", periods=7), "sales": [200, 220, 250, np....
Sefineh Tesfa's user avatar
-1 votes
1 answer
178 views

I have a dataset with a column, Joining_Date, that contains dates in different formats. I want to convert them all to one date format. df['Joining_Date'] 0 2016-10-26 2 16/08/2018 3 2017/06/...
Hafsa Ali's user avatar
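One possible approach to the mixed-format question above, as a minimal sketch: try a short list of candidate formats per value and keep the first that parses. The format list is an assumption inferred from the three samples visible in the excerpt, the third sample value is completed here for illustration only, and `normalize_date` is a hypothetical helper name.

```python
from datetime import date, datetime

# Assumed candidate formats, inferred from the samples visible in the question.
# Extend this tuple for any other formats present in the real column.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y/%m/%d")


def normalize_date(text: str) -> date:
    """Return the first successful parse of `text` against CANDIDATE_FORMATS."""
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue  # this format did not match; try the next one
    raise ValueError(f"unrecognized date format: {text!r}")


# The last value is illustrative; the excerpt truncates it.
samples = ["2016-10-26", "16/08/2018", "2017/06/30"]
normalized = [normalize_date(s).isoformat() for s in samples]
print(normalized)
```

On a DataFrame this could be applied with `df['Joining_Date'].map(normalize_date)`. pandas 2.0 and later also accept `pd.to_datetime(df['Joining_Date'], format="mixed")`, though an explicit format list makes the day-first versus month-first choice visible.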
1 vote
2 answers
240 views

I've tried putting the NumPy scientific notation function in different places, but that didn't work, and I haven't been able to find anything else that works. def optimal_resistor(name="&...
riley britchford's user avatar
1 vote
1 answer
134 views

I am working on a Flask app constructed as follows: The starting HTML template, controlled by a view, displays a text box, where the user can enter a SQL request to request a SQLite database ...
Corsair's user avatar
  • 749
0 votes
3 answers
224 views

I’m working on a project where I process a large dataset of CSV files. Each file has a name like this: NATURALGAS21FEB25270CE.MCX, and I want to extract the following: 21FEB25 - DATE, 270-...
its m's user avatar
  • 49
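For file names shaped like the one in the question above, a regular expression with named groups is one option. This is a sketch under assumptions: the layout (letters, a DDMONYY date, digits, trailing letters, a dot, an extension) is inferred from the single example, and the group names `symbol`, `number`, `suffix` and `extension` are hypothetical labels, since the excerpt only names the date field before it is cut off.

```python
import re
from datetime import datetime

# Assumed layout, inferred from the one example file name in the question.
FILENAME_RE = re.compile(
    r"^(?P<symbol>[A-Z]+)"           # leading letters, e.g. NATURALGAS
    r"(?P<date>\d{2}[A-Z]{3}\d{2})"  # DDMONYY, e.g. 21FEB25
    r"(?P<number>\d+)"               # digits after the date, e.g. 270
    r"(?P<suffix>[A-Z]+)"            # trailing letters, e.g. CE
    r"\.(?P<extension>[A-Z]+)$"      # extension, e.g. MCX
)


def parse_filename(name: str) -> dict:
    """Split a file name into its parts; raise ValueError if it does not match."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    parts = match.groupdict()
    # %b matches abbreviated month names and depends on the current locale.
    parts["date"] = datetime.strptime(parts["date"], "%d%b%y").date()
    return parts


parts = parse_filename("NATURALGAS21FEB25270CE.MCX")
print(parts)
```

Names whose leading part contains digits, or whose numeric field has a decimal point, would need a looser pattern than this one.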
1 vote
0 answers
108 views

I am using pyright in strict mode and I would like to avoid littering my code with exceptions telling pyright to ignore some line. I have been using my own stub file to great success in the case of ...
Chechy Levas's user avatar
  • 2,406
2 votes
2 answers
234 views

What is the difference between using an interpolation in the pandas package and the linear space of the numpy package? For instance in the following: import numpy as np import pandas as pd In [12]: np....
0 votes
1 answer
68 views

This is what I have to do The police department has requested the creation of a Storm and Crime Data Report (SCDR) using historical data from the city of Miami for the calendar year 2024. Miami police ...
Shuawna Kang's user avatar
0 votes
1 answer
114 views

I am working on a project where I frequently filter my DataFrame and then return it for further processing. For example, in one file I have this code: df = df[df['Ticker'].str.startswith("...
its m's user avatar
  • 49
0 votes
3 answers
85 views

I have a dataframe where column 1 has values like 'A', 'B' etc (categorical variable) and column 2 (also a nominal categorical variable) has values like 'city 1', 'city 2' etc. I want to group by ...
user9026's user avatar
  • 970
0 votes
0 answers
76 views

I am working on calculating the Extended Internal Rate of Return (XIRR) for a series of irregular cash flows using Python. The cash flows occur on uneven dates, and I want the result to match Excel’s ...
Honeybee Digital's user avatar
-2 votes
1 answer
103 views

I am learning data science and working with JSON data. Suppose I have a JSON file with provinces and their populations, but each province also has a list of cities. For example: When I load this into ...
Sage Yang's user avatar
0 votes
0 answers
27 views

I have a dataset populated from an API call to Splunk. The dataset contains the following: time destip destport transport 2025-09-17 22:03:09 172.16.5.1 53 UDP 2025-09-17 22:03:10 172.16.5.1 53 UDP ...
Jhowel's user avatar
  • 63
2 votes
1 answer
178 views

Using Python, I am trying to copy the contents of an Excel sheet, to another Excel file, with the same sheet name. The Pandas code to write to the destination sheet seems to overwrite the whole ...
Herman's user avatar
  • 171
0 votes
1 answer
124 views

I am using Jupyter Notebook inside an Anaconda virtual environment and I have a notebook with multiple plots using matplotlib and pandas DataFrames with styled formatting. When I try to export the ...
Gouri Phadnis's user avatar
0 votes
0 answers
43 views

I am running tests on a notebook from within Databricks' "Cleanroom" environment (therefore it is run on serverless compute managed by Databricks). I'm running into the following timeout ...
sam's user avatar
  • 1
0 votes
1 answer
115 views

Is it possible to overwrite and save a DataFrame to an Excel or CSV file if that file is already open? I tried using df.to_excel("file.xlsx", index=False) to overwrite an existing Excel file ...
Ayuba Ahmed Bayugo's user avatar
2 votes
1 answer
83 views

I have a multi-plot bar plot figure produced with matplotlib/seaborn and I'd like to control the tick lines and background style. When I try to use sns.set_style("whitegrid"), the ...
Will Hamilton's user avatar
0 votes
0 answers
50 views

I have a dataframe with two duplicated columns that can be in variable positions. In the example, COLUMN1 and COLUMN4 are optional. COLUMN1 CLASIFICATION CLASIFICATION COLUMN4 or CLASIFICATION ...
Cristian Avendaño's user avatar
0 votes
2 answers
83 views

I am grabbing OHLC stock price data from tiingo using the following code: df = get_tiingo_data(ticker, start_date="1990-01-01", end_date="2025-12-31") Which calls this function: ...
GC123's user avatar
  • 411
1 vote
0 answers
56 views

I'm implementing a bar cutting optimization in Python using the First Fit Decreasing (FFD) algorithm. I have a list of orders, each with a quantity and length. The code should allocate all pieces into ...
Luciano Junior's user avatar
0 votes
1 answer
97 views

I'm trying to get some data with yfinance, but I'm trying to get the timeframe for 2 days, which doesn't exist by default for yfinance. So, I tried to aggregate 1 day + 1 day. But I'm getting an error ...
andrellima's user avatar
1 vote
2 answers
103 views

My goal is to compare the contents of two 2-d arrays (~15k rows, 8 columns) and get a list of the rows/columns where the comparison (greater-than) is true. I need that output in a readable format, ...
BunnyKnitter's user avatar
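For the element-wise comparison described above, `np.argwhere` on the boolean mask gives the (row, column) positions where the test holds. A minimal sketch with small made-up arrays standing in for the roughly 15k x 8 data; the report wording is only one possible readable format.

```python
import numpy as np

# Small illustrative arrays; the real data is described as ~15k rows by 8 columns.
a = np.array([[1, 5, 3], [7, 2, 9]])
b = np.array([[3, 4, 3], [6, 9, 1]])

mask = a > b                   # boolean array, True where a exceeds b
positions = np.argwhere(mask)  # one (row, col) pair per True entry

# Turn each position into a readable line.
report = [f"row {r}, col {c}: {a[r, c]} > {b[r, c]}" for r, c in positions]
print("\n".join(report))
```

The comparison itself is vectorized, so only the final formatting step loops in Python, and only over the matching cells.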
3 votes
2 answers
114 views

I have a function that receives a pandas dataframe as an argument, plots it, and saves the generated figure with the name of the passed dataframe. For instance, this is the function: def plot_function(df): ...
amiref's user avatar
  • 3,491
2 votes
1 answer
95 views

CONTEXT Let's say I have df, which consists of a column of delimited strings that I would like to sort. import pandas as pd df = pd.DataFrame({'sort_me':['foo; bar','foo; bar','bar; foo']}) df ...
bismo's user avatar
  • 1,645
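Assuming the goal in the question above is to sort the tokens inside each delimited string (the excerpt is cut off before stating it), one sketch is to split, sort and rejoin each value. The output column name `sorted` is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'sort_me': ['foo; bar', 'foo; bar', 'bar; foo']})

# Split each string on the delimiter, sort the pieces, and join them back.
df['sorted'] = df['sort_me'].map(lambda s: '; '.join(sorted(s.split('; '))))
print(df)
```

After this step, rows that contain the same tokens in a different order compare as equal, which is often the reason for sorting them.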
4 votes
2 answers
126 views

I was looking to add subtotals to a pandas dataframe - a question which I found to be asked here often. The answers making use of the deprecated pd.append aren't relevant anymore so I figured a more ...
Thomas Petit's user avatar
1 vote
2 answers
114 views

I'm trying to update a DataFrame value in place and am unable to do so. I've tried several different methods, none of which have worked. Looking up information online only gets me examples of comparing ...
user23435723's user avatar
2 votes
1 answer
59 views

I've been trying to plot a data frame as a box plot using matplotlib. My data frame looks something like this: 9-1 9-2 9-3 9-4 9-5 0 23 16.0 18.0 18.0 26 1 27 18.0 20.0 17.0 33 ...
Dylan's user avatar
  • 23
0 votes
2 answers
92 views

Scenario: I am trying to merge 2 pandas dataframes. DF1 has the bulk data, and DF2 is a sort of mapping. Based on the combination of the values of 3 different columns, I want to put a column from DF2 ...
DGMS89's user avatar
  • 1,711
0 votes
1 answer
42 views

I am trying to run Python project with Pandas as a dependency, installed with Poetry. name : pandas version : 2.3.2 ...
Mikko Ohtamaa's user avatar
1 vote
1 answer
78 views

In python pandas, I'm wondering if there is a builtin function that does the same as df_to_dict below. Speed is of the essence, as my dataframe can have thousands of rows. Basically return a ...
Faraz Masroor's user avatar
2 votes
1 answer
167 views

New to using Pandera. I want it to print the record(s) that fail the check. This is the simple check I want, fail when the system capacity is over 500: import pandera.pandas as pa import pandas as pd ...
Sam Firke's user avatar
  • 23.4k
4 votes
4 answers
207 views

Following is my data frame. id name class -------------------------- 0 Nick a 1 Jane b 2 Jacon a 3 Jack b 4 Cooze a -----...
user6781's user avatar
  • 621
0 votes
2 answers
120 views

I am trying to create a new column in a pandas dataframe, where the values are based on ranges of the time (hours and minute) of the DatetimeIndex. Here is my dataframe: DatetimeIndex ...
AjWinston's user avatar
  • 129
0 votes
0 answers
65 views

I am working with a DataFrame of almost 1M rows and want to compute a column as a function of two others. My first idea was to use .apply(axis=1) with a lambda function to do the operation, but it was ...
amiref's user avatar
  • 3,491
3 votes
1 answer
73 views

I am trying to create a choropleth of India that shows railway accident data. When I run it, a choropleth is created, but the states of India are too small and do not reflect their real ...
Ryan_Brusseau's user avatar
0 votes
1 answer
115 views

I am using a dataset of video game sales and I am trying to make a pie chart of the top 10 publishers based on global_sales. Right now I have: data['Publisher'].value_counts().head(10).plot.pie(autopct=...
azzy's user avatar
  • 11
0 votes
1 answer
77 views

I have a pandas dataframe agent deployed in an Azure FastAPI app service. agent = create_pandas_dataframe_agent( llm, df, verbose=True, ...
crux's user avatar
  • 63
0 votes
0 answers
31 views

I am using PVLib modeling to estimate soiling for my project. I am checking both HSU and Kimber models to compare and identify the best result. The problem is, I would expect outputs to be somewhat ...
Tina's user avatar
  • 1
2 votes
2 answers
141 views

I am writing a routine to load a large dataset into a Pandas DataFrame from a bespoke text format. As part of this process, I need to add new columns to a DataFrame. Sometimes I need to broadcast a ...
Dan Lenski's user avatar
  • 80.4k
1 vote
2 answers
101 views

I'm performing data validation in Python using the Pandas module. I have two datasets to compare source and target data for expected values. I've successfully merged two dataframes using pd.merge and ...
Cassidy Alexander's user avatar
-4 votes
2 answers
152 views

I have a dataframe with multiple rows that can be combined to a single row. I'm not sure how to do it. Input DataFrame: Emp# Name Week1 Week2 Week3 Week4 1 mary 45 0 0 0 1 mary 0 45 0 0 1 mary 0 0 63 ...
Anupkumar Kasi's user avatar
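When each row for an employee holds one non-zero week and zeros elsewhere, as in the input above, grouping on the identifying columns and summing collapses them into one row. A sketch with illustrative data modeled on the visible rows; the Week4 values are made up because the excerpt is truncated.

```python
import pandas as pd

# Illustrative data modeled on the question; Week4 values are invented.
df = pd.DataFrame({
    "Emp#": [1, 1, 1, 1],
    "Name": ["mary", "mary", "mary", "mary"],
    "Week1": [45, 0, 0, 0],
    "Week2": [0, 45, 0, 0],
    "Week3": [0, 0, 63, 0],
    "Week4": [0, 0, 0, 40],
})

# One output row per (Emp#, Name); each week column is summed within the group.
combined = df.groupby(["Emp#", "Name"], as_index=False).sum()
print(combined)
```

Summing is only correct if the other rows really are zero for a given week; if they can hold missing values or conflicting numbers, `max` or `first` may be a better aggregation.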
2 votes
1 answer
78 views

CONTEXT I am NOT trying to round to the nearest 0.5. I know there are questions on here that address that. Rather, I am trying to change the decimal value of each value in each row to 0.5 while ...
bismo's user avatar
  • 1,645
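If the intent in the question above is to keep each value's integer part and set its fractional part to 0.5 (an assumption, since the excerpt ends mid-sentence), flooring and adding 0.5 does that for non-negative values. The sample numbers are made up.

```python
import numpy as np

# Illustrative non-negative values; not taken from the question.
values = np.array([1.2, 3.9, 7.0, 10.45])

# Drop the fractional part, then add 0.5 to every value.
with_half = np.floor(values) + 0.5
print(with_half)
```

For negative inputs `np.floor` rounds away from zero, so `np.trunc` with a sign-aware offset would be needed if those can occur.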
1 vote
2 answers
99 views

I have a dataframe df made up of n columns which are groups and one, "data". This dataframe is then grouped on the n group columns. df = pd.DataFrame(data={"g0": ["foo", ...
Aristide's user avatar
1 vote
0 answers
84 views

I did find a couple of threads related to similar topics, but the use case is always slightly different, as the goal often is just to format the x-axis of a plot or similar. So I am opening this new ...
hugo's user avatar
  • 11
2 votes
1 answer
154 views

Imagine I have this dataframe called temp: temp = pd.DataFrame(index = [x for x in range(0, 10)], columns = list('abcd')) for row in temp.index: temp.loc[row] = default_rng().choice(10, size=4,...
Saeed's user avatar
  • 2,151
-1 votes
3 answers
245 views

I would like to scrape the 2nd table in the page seen below from the link - https://fbref.com/en/comps/9/2023-2024/stats/2023-2024-Premier-League-Stats in Google Colab. But pd.read_html only gives me ...
rian patel's user avatar
2 votes
1 answer
89 views

Scenario: I have a pandas dataframe. I am trying to use the values in a given column (year) to find the relevant header name and add it to a new column (year_name). For example, if the dataframe looks ...
DGMS89's user avatar
  • 1,711
2 votes
1 answer
270 views

Prerequisites: Python 3.11.7 Pandas 2.3.0 Numpy 2.1.3 Pydantic 2.11.7 In the pandas documentation, it states that missing values for numeric data types are filled in with numpy.nan: https://pandas....
MikeFenton's user avatar
6 votes
2 answers
390 views

I'm new to Python and to Pandas, and I am desperately trying to understand how or why this is happening. I have a CSV file with some data, which has some rows which have extra commas, which are not ...
Cillian Myles's user avatar