310 questions
0
votes
1
answer
55
views
Data transformation in PowerBI for a column where data source is Azure Synapse Analytics SQL
I loaded my data into Power BI by connecting to Azure Synapse. I need to transform a text column in which a single row contains a list of numeric values, e.g. [0,1.0,0.05], and needs to ...
0
votes
1
answer
112
views
Formatting csv file format in pyspark
I have a | delimited csv file with data as shown below.
AccountID|BounceSubcategory|BounceTypeID|BounceType|SMTPBounceReason|SMTPMessage|SMTPCode|TriggererSendDefinitionObjectID|...
2
votes
1
answer
55
views
How to fit scaler for different subsets of rows depending on group variable and include it in a Pipeline?
I have a data set like the following and want to scale the data using any of the scalers in sklearn.preprocessing.
Is there an easy way to fit this scaler not over the whole data set, but per group? ...
0
votes
0
answers
12
views
Handling columns in CSV not known by the schema
I’m new to AWS Glue but want to use it to ingest large amounts of data from a CSV file stored in S3 into a PostgreSQL database.
The data is client provided and can contain a mix of “required” and “...
2
votes
5
answers
103
views
Dropping grouped rows based on a certain hierarchical column
Suppose I have this pandas dataset:
ID  Question Code
1   Q01
1   Q01-1
1   Q02
2   Q01
2   Q02
2   Q02-1
2   Q02-1-1
2   Q02-2
I want to remove the rows based on certain hierarchical conditions between the values of ...
-1
votes
1
answer
72
views
How to Transform a Soccer Match DataFrame to a Long Format with Separate Rows for Home and Away Teams in R [duplicate]
I have a DataFrame in R with the following columns:
season: The season of the match (e.g., "2015/2016")
stage: The stage or round of the match (e.g., 1 for Round 1)
home_team_api_id: The ID ...
-1
votes
1
answer
52
views
Data-transformation with R and dplyr [closed]
I have data in the following format:
Location   Species   Date        Count
Location1  Species1  01-01-2024  2
Location1  Species1  01-02-2024  4
Location1  Species1  01-03-2024  3
Location1  Species2  01-01-2024  6
...
0
votes
2
answers
2k
views
How to Reference Previous Row Value in Power Query for Custom Column Logic?
I’m working in Power Query and trying to create a Custom Column that mimics the following Excel formula, but I'm struggling with the M code:
=IF(A2 > 0, IF(A2 = A1, 0, A2), A2)
The goal is:
If the current ...
0
votes
1
answer
43
views
Error in M language while Calculating Fiscal Quarter
Errors: Expected to find a right parenthesis <')'>, but a keyword <'then'> was found instead
Power Query Editor
I'm trying to find a solution for inserting a new column in Power Query ...
1
vote
1
answer
99
views
Excel how to merge duplicate rows into a single row with additional columns?
I need help formatting my data as shown in the image below. These are only 3 columns; I have many more like them. How can I format this?
Current State
Desired State
For context, my dataset has 2100 ...
-1
votes
1
answer
290
views
How to Separate Data with Inconsistent Patterns into a Structured Format in Excel
Inconsistent Values in Cells
I'm working with a dataset where multiple values in a cell are tagged under categories like Location, Host, Guest, and Bucket, and separated by line breaks. I need to ...
0
votes
1
answer
101
views
Change x-axis scale to cuberoot without transforming raw data using trans_new() [closed]
I need to change the x-axis of my ggplot figure to a cuberoot scale, without transforming the raw data. My code below had been working but with the new R update, I am getting the error,
Error in if (...
0
votes
0
answers
59
views
Power BI - DateTime value loses 1 second when changing the field type from DateTime to Text
I have a DateTime value called "Hour". It represents each hour (rounded) of the day.
Due to other transformations I want to apply, I need to change the field's type to Text.
However, when I ...
0
votes
2
answers
47
views
How do I create a new column in my DF of daily measurements that gives me the increase between today's and yesterday's measurement?
I have a column of datapoints for daily measurements in my DF. I would like to add a new column to said DF that tells me the increase or decrease of this measurement in comparison to yesterday's.
...
0
votes
1
answer
46
views
Trying to convert time format 0:00:00 to seconds (integer)
Trying to convert time format 0:00:00 to seconds (integer), to use as a derived column in SSIS. I've tried (DT_I4)TOKEN(column,":",1) * 60 + (DT_I4)TOKEN(column,":",2), but it is ...
2
votes
0
answers
2k
views
Error: dbt configuration is invalid : Parsing Error Env var required but not provided:
I have been facing this issue in dbt core while using the DBT Power user extension in VSCode.
Is there any way to tackle this error? I have set the environmental variable through the command prompt. ...
3
votes
1
answer
57
views
Using SQL, is there a way to transform a dataset that makes new columns for data points within the table?
I've been attempting to clean a dataset for a project, but ran into issues when trying to transform the data. The provided problem set contains a similar scenario to the real dataset. If you look ...
-1
votes
1
answer
73
views
Transform a nested XML array of keys and values to a single row in Azure Data Factory
I'm creating a data flow to write XML data to a SQL table for invoice line items. I have XML source data (edited) as follows and after flattening, I have isolated the "data" elements (under ...
0
votes
2
answers
37
views
Modifying the column of a dataframe based on some rules
Suppose that I have a data frame as follows:
idp<-sort(rep(c("A","B","C","D"),10))
a1<-c(1,1,1,2,3,4,3,4,2,2)
a2<-c(3,3,NA,NA,4,1,2,3,1,1)
a3<-c(NA,...
0
votes
1
answer
54
views
Transforming Data from columns to Row Based on a particular Column and group in Pandas
Here is the sample Data
data = {'Game': ['NFS', 'Forza', 'Wreckfest','Dirt Rally','Burnout','Project Cars','Grid 2','GTA','Saints Row','Persona 5','COD','Battlefield','Counter Strike'],
'Game ...
0
votes
1
answer
74
views
Make number representation selectable in matrix visual for specific values in Power BI
I have the following report.
Here, the user has the option to select different values, which are then displayed in the matrix visual. Since the numbers are very large, I want the user to have the ...
0
votes
0
answers
47
views
I want to group the data and transform it into a different format in MongoDB
This is my goal schema that stores the goal
const mongoose = require("mongoose");
const Schema = mongoose.Schema;
const userSchema = new Schema(
{
email: {
type: String,
...
0
votes
1
answer
119
views
r randomly assign 1 or 0 based on conditions
For a dataset like this
MainID SubID DOB BMI
1234 1234_A Feb-19-2024 10.1
1235 1235_A Jan-11-2023 17.23
1235 1235_B Jan-11-2023 19....
0
votes
1
answer
60
views
How do I aggregate an array of maps in Java Spark
I have a dataset "events" that includes an array of maps. I want to turn it into one map which is the aggregation of the amounts and counts.
Currently, I'm running the following statement:
...
0
votes
1
answer
2k
views
Calculate a difference measure in Power BI based on two selected dates
At the moment I'm calculating a measure which describes the value difference between a report date and the day before the selected report date. That means, if a user selects today, then it shows the ...
1
vote
1
answer
143
views
Dynamic year values when creating a customized column in Power BI depending on the actual year
I want to aggregate three time buckets into one. At the moment I create a customized column like this:
timebucketNEW = IF([timebucket] IN {"YtD", "BoY", "2023"}, "2024...
1
vote
1
answer
73
views
Nested loop to several arrays with jolt library
In input, I have a list of persons. Each person has several addresses and roles on contract.
Example :
Person #001 has :
two addresses secondary residence and tax address ;
two roles co-subscriber ...
1
vote
2
answers
68
views
r create a new Indicator column based on combination of variables
I have a dataset with two columns, ID and Group.
ID Group
10 Red
11 Red
13 Blue
13 Red
15 Blue
15 Blue
17 Blue
17 Red
17 Red
19 Blue
19 ...
0
votes
1
answer
203
views
Data Transformation DSL [closed]
Apologies if this question is too vague.
I'm involved in developing a system that primarily transforms the data from one format to another.
These are pretty much XML --> XML, XML <--> JSON, ...
4
votes
5
answers
203
views
extract integers from characters in R
I am in R. I want to extract just the numbers from df1.
I have for example:
df1 <- data.frame( column1 = c("Any[12, 15, 20]", "Any[22, 23, 30]"), column2 = c("Any[4, 17]...
0
votes
1
answer
349
views
Neither PowerTransformer nor QuantileTransformer is working for a feature
I am working on the famous bike-sharing dataset. The distribution of the target variable cnt is heavily skewed. I am sharing its distribution as well for my readers' convenience.
And I wanted to transform it ...
1
vote
2
answers
61
views
Collapse a dataframe based on virtual grid
I am a beginner in pandas and need some help.
I have the following dummy data:
raw_data = {
"Unnamed: 0" : ["Index_with_NaNs", 1., np.nan, 2., np.nan, np.nan, 3., np.nan, np.nan, ...
0
votes
1
answer
944
views
Join multiple tables in Azure Data Factory to create a structure with an embedded array field?
I am trying to set up an Azure Data Factory transformation. I have a SQL Server database with three tables: Students, StudentClasses and Classes. I would like to use Azure Data Factory to read these ...
0
votes
1
answer
92
views
r retain non zero rows in case of duplicates
I have a dataset with duplicate rows if we group by two columns
ID Group Value
1 z1 0
1 z1 0.81
2 z2 2.89
2 z2 1.53
3 z1 ...
0
votes
1
answer
43
views
r retain duplicates after group by not min value
I have a dataset like this.
ID Group Value Col3
1 z1 1.29 1
1 z1 0.81 1
2 z2 2.89 1
2 z2 1.53 2
3 z1 0.13 ...
0
votes
1
answer
57
views
r generate indicator columns based on conditions
Suppose this is my dataset
State Enter Event Lag
State-A 2000 2004 -4
State-A 2001 2004 -3
State-A 2002 2004 -2
State-A 2003 2004 -1
State-A 2004 2004 0
State-A 2005 ...
1
vote
3
answers
168
views
How to create new data rows for each unique value in a column (potentially grouped by other variables) in R
I have data that has the following structure:
data <- data.frame(
uniqueid = c(1, 1, 2, 2, 3, 3),
year = c(2010, 2011, 2010, 2011, 2010, 2011),
agency = c("SZ", "SZ", ...
0
votes
1
answer
841
views
Loading File into DuckDB using Python Fails Due to "\N" Values Used to Represent Nulls
I'm trying to load a csv into Python, but the file keeps failing because one of the fields has a '\N' to represent null values in a field that is Integer. I can't figure out how to deal with this - I'...
0
votes
0
answers
101
views
Data Transformation Issue on End-to-End ML Project - KeyError: 'Date_of_Journey'
I am following the process shown on Wine Quality Prediction End-to-End ML Project on Krish Naik's YouTube channel to do a Flight Fare Prediction Project.
I run this cell of data transformation ...
0
votes
1
answer
49
views
Data Transformation Issue on End-to-End ML Project - convert_to_minutes() Takes 1 Positional Argument But 2 Were Given
I am following the process shown on Wine Quality Prediction End-to-End ML Project on Krish Naik's YouTube channel to do a Flight Fare Prediction Project.
I run this cell of data transformation ...
0
votes
0
answers
56
views
Data Transformation Pipeline Issue on End-to-End Machine Learning Project
I am following the process shown on Wine Quality Prediction End-to-End ML Project on Krish Naik's YouTube channel to do a Flight Fare Prediction Project.
I am facing an issue with Data Transformation ...
0
votes
1
answer
107
views
How to transpose columns into rows and ensure repetition of rows accordingly in Excel?
How to transpose the columns and ensure that rows are repeated accordingly?
The dataset has the following data:
Date      Year  Month  Day  USD  EUR  JPY
1/1/1994  1994  1      1    10   20   5
1/1/1995  1995  1      1    12   30   10
The ...
1
vote
1
answer
96
views
How to successfully use min_rank() in a pipe?
I am currently working on solving exercises from the R for Data Science book but I am struggling with data transformation.
I want to create a new column in my dataframe 'temp' that ranks each 'carrier' ...
0
votes
1
answer
28
views
Azure Data Transformation Logic
Is there no way to add an additional column at the Sink and create a dynamic expression? This will be a derived column whose value depends on some joins from source tables.
I have tried using data ...
0
votes
1
answer
230
views
Creating an additional column in sink using logic that involves three different columns from three different source tables respectively
What will be the steps and expression for derived column if the logic for populating the column is as below:
The data mapping is using data factory:
select c.ClaimID, pl. Related StoreNumber as ...
-1
votes
1
answer
519
views
Find the count of current employees in Power BI using DAX
I have an employee table like this.
emp table
My requirement is to count current employees in the organisation using DAX.
In the above table, for employee id 1001 there were 3 rows of data (i.e ...
1
vote
0
answers
644
views
Create hierarchy slicer in Power BI using parent-child id from same column
I have sample data as follows.
I need to create a hierarchical slicer using the name field based on the parent id and id.
id  parent Id  name
1   null       A
2   null       B
3   1          a1
4   1          a2
5   3          a1-1
6   3          a1-2
7   4          a2-1
8   ...
1
vote
1
answer
267
views
How to convert a WeakMap to a JSON Object in Javascript?
Given a JS WeakMap, I want to convert this to JSON data.
const weakmap = new WeakMap() // Convert weakmap to JSON data / JS object
I've tried JSON.parse(JSON.stringify(weakmap)) which returns empty ...
0
votes
2
answers
27
views
JMeter and DFE extensions
I have a project where we are using LoadRunner and a DFE extension.
My question: is it possible to migrate the project from LR to JMeter without changing the DFE extension?
In general what it does ...
0
votes
1
answer
380
views
ADF Data flow expression
I am trying to build an ADF data flow select operation to dynamically select column names.
I am receiving the required column names in an array parameter named 'colNames' and then I am trying to use that in ...