-2 votes
0 answers
108 views

I need some help with my code for basic data wrangling. The instructions for coding in R are as follows: Let's say you want to live in the Northeast or West of the US and you want the homicide rate ...
nocturne-oz's user avatar
12 votes
0 answers
326 views

I have been using the Data Wrangler extension in VS Code for a while; it is very useful for analyzing datasets and filtering columns to see the features. When I opened a dataframe in it, it used to ...
Javad Faraji's user avatar
1 vote
1 answer
109 views

Given two polars dataframes of the same shape, I would like to print the number of values different between the two, including missing values that are not missing in the other dataframe. I came up ...
robertspierre's user avatar
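A library-agnostic sketch of the counting rule this question describes, assuming each dataframe is represented as a dict of equal-length column lists with `None` standing in for a missing value. The helper name `count_differences` and the sample tables are hypothetical, and this is not polars code; in polars a null-aware comparison such as `ne_missing` would take the place of the plain `!=` used here.

```python
def count_differences(left, right):
    """Count cells that differ between two same-shaped tables.

    A cell that is None in one table but not in the other counts as a
    difference; None in both tables counts as equal.
    """
    if left.keys() != right.keys():
        raise ValueError("tables must have the same columns")
    total = 0
    for name in left:
        if len(left[name]) != len(right[name]):
            raise ValueError(f"column {name!r} has different lengths")
        # In Python, None != 2 is True and None != None is False,
        # which matches the rule stated in the question.
        total += sum(a != b for a, b in zip(left[name], right[name]))
    return total


# Hypothetical sample tables (not from the question).
a = {"x": [1, None, 3], "y": ["a", "b", None]}
b = {"x": [1, 2, 3], "y": ["a", "c", None]}
n_diff = count_differences(a, b)
print(n_diff)
```

Floating-point NaN values would need separate handling, since NaN compares unequal to itself.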
1 vote
3 answers
100 views

When you join two tables, Stata prints the number of rows merged and unmerged. For instance, take Example 1 on page 13 of the Stata merge doc: use https://www.stata-press.com/data/r19/autosize merge 1:...
robertspierre's user avatar
-3 votes
0 answers
45 views

Suppose I have df <- data.frame(name=c("Hello", "Hi", "GoodMorning")) I would like to convert "GoodMorning" into "GoodEvening" (of course this is ...
robertspierre's user avatar
0 votes
2 answers
119 views

I have the following dataframe: df <- tribble( ~nuts_code, ~value, "AT", 1, "AT1", NA, "AT2", NA, "BG", NA, "BG1", 10, "BG2"...
robertspierre's user avatar
1 vote
4 answers
190 views

I have the following tibble: eu_df <- structure( list( nuts_code = c( "PT17", "PT17", "PT17", "PT17", "PT17", "PT17", "PT17", ...
robertspierre's user avatar
0 votes
1 answer
82 views

I have a dataset with 5 variables. Each variable is a name of a fruit, with 0 (don't like), and 1 (like). The data frame is like this: set.seed(225) fruits<-data.frame(id= seq(1:10),Apple=...
Ali's user avatar
  • 33
4 votes
2 answers
175 views

I have a messy data set, which generally resembles the output of the following schools_messy <- tibble::tribble( ~data, "state:maryland", "location:bowie||name:bowie state ...
David Robie's user avatar
1 vote
0 answers
115 views

I've been using Data Wrangler in Visual Studio Code to visualize dataframes. Normally, Open df in Data Wrangler opens a separate tab. One day the Data Wrangler view happened to be displayed inline like this (...
Kenny's user avatar
  • 2,022
1 vote
2 answers
70 views

I present here an input data frame that contains lists of data frames that contain lists. Some of the bottom level lists are empty and some lists have length greater than one. I am looking for some R ...
Nevil's user avatar
  • 171
-1 votes
1 answer
128 views

Let's say I have a time series of behaviours. It contains the timing and identity of people who have performed a particular behaviour. I want to list all the people who performed the behaviour within ...
KrisAnathema's user avatar
1 vote
2 answers
94 views

I'm trying to group data into a grouping variable based on whether or not there is data in specific columns. In other words, if there is data in the same row for V1 & V2 below, then I want to put ...
HelplessStatistician's user avatar
0 votes
1 answer
62 views

I have a dataframe that looks like this: Family Order Class Presence Year Site Location Lat Long Aeshnidae Odonata Insecta 0 2021 KAV01 NASS -17.4 18.5 Aeshnidae Odonata Insecta 0 2023 KAV01 NASS -17....
Daniel Estévez's user avatar
1 vote
1 answer
42 views

Using R, I am trying to partially fill in a dataframe (~200 rows) using another (~170 rows) by matching on an ID variable. Roughly 50% of the IDs match, and I'd like to just leave the other values ...
RobNewToR's user avatar
0 votes
0 answers
53 views

I have a dataframe that looks something like this: Name age score year state Tim 65 123 2016 KS Tom 72 476 2016 OH Larry 58 354 2016 NS Dave 81 878 2017 KS Rob 66 1123 2017 OH Sam 32 45 2017 OH Jeff ...
lwe's user avatar
  • 401
1 vote
1 answer
53 views

I would like to run iterations of a single model, substituting one of a set of 34 different response variables in each iteration, and organize the results (from summary()) of all of those models into ...
JKO's user avatar
  • 307
1 vote
2 answers
77 views

I have a dataset that looks something like this: name party count year likes retweet Tom R 1 2016 1357 23 Dave R 1 2016 1881 34 Larry D 1 2016 324 45 Tim D 1 2016 5587 56 Rob R 1 2016 9847 67 Sam D 1 ...
lwe's user avatar
  • 401
3 votes
2 answers
180 views

I am trying to load into R and wrangle data from the CPS (Current Population Survey) which can be downloaded at this link. There is an ostensible codebook for the information on variables and the ...
flâneur's user avatar
  • 321
0 votes
1 answer
99 views

So I am practicing data wrangling and I have encountered an issue. food['GPA'].unique() And the output is array(['2.4', '3.654', '3.3', '3.2', '3.5', '2.25', '3.8', '3.904', '3.4', '3.6', '3.1'...
Sjaikisan's user avatar
0 votes
1 answer
33 views

I'm trying to use pivot_longer() to rearrange a dataset I was given, which looks like the result of a database join operation. Here's an example of what it looks like: dat <- tibble('Plant_Name'=c('...
S. Robinson's user avatar
1 vote
2 answers
64 views

I have this dataframe. import pandas as pd x = { "year": ["2012", "2012", "2013", "2014", "2012", "2014", "2013", ...
lokalhangatt's user avatar
1 vote
2 answers
64 views

I have the following URLS: www.google.com?utm_source=site_corriere&utm_medium=video&utm_content=box www.google.com?utm_source=site_rep&utm_medium=display&utm_content=box www.google.com?...
paolotroia's user avatar
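A minimal sketch, assuming the aim is to split each URL's `utm_*` query parameters into separate fields (the excerpt is cut off before it states the goal). The helper name `utm_params` is hypothetical; only the two URLs that appear in full in the excerpt are used.

```python
from urllib.parse import parse_qsl


def utm_params(url):
    """Return the utm_* query parameters of a URL as a dict."""
    # The URLs carry no scheme, so take everything after the first "?"
    # as the query string instead of relying on full URL parsing.
    _, _, query = url.partition("?")
    return {key: value for key, value in parse_qsl(query) if key.startswith("utm_")}


urls = [
    "www.google.com?utm_source=site_corriere&utm_medium=video&utm_content=box",
    "www.google.com?utm_source=site_rep&utm_medium=display&utm_content=box",
]
rows = [utm_params(u) for u in urls]
for row in rows:
    print(row)
```

Each resulting dict can then be used as one row of a table with `utm_source`, `utm_medium`, and `utm_content` columns.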
0 votes
4 answers
98 views

I have the following input table: input <- structure( list(individual = c(1, 2, 3, 4), age = c(20, 34, 29, 30), earnings_2020 = c(0, 0, 1, 0),...
Chloe's user avatar
  • 1
4 votes
3 answers
123 views

I have some trouble with the which.min function inside a dplyr pipe. I have a cumbersome solution (*) and I'm looking for a more compact and elegant way to do this. Reproducible example: library(dplyr) ...
Wael's user avatar
  • 1,808
2 votes
2 answers
74 views

So, I am using constituency data of the German Election 1994 and some observations contain strings that indicate that the value is given in a different row (based on the Scheme "siehe Wkr xxx...
Paul-Markus Rudolf's user avatar
0 votes
1 answer
39 views

I'm a little perplexed about the exact way to proceed with this wrangling procedure. I have a dataset which consists of raters assessing lung sounds (S1,...,S40). For each sound the assessed ...
Buczinski's user avatar
3 votes
1 answer
86 views

The below examples demonstrate that passing an object to deparse() and substitute() produces different output depending on whether the object is passed to the function with %>% and whether the ...
socialscientist's user avatar
0 votes
2 answers
86 views

I am trying to load the text from a PDF into R for text analysis. The PDF is formatted so that the text has columns for extra information. Please see the screenshot below. I'd like to load the main ...
Ashley Wu's user avatar
0 votes
1 answer
104 views

Question I have a data frame in R where each row contains multiple columns with categorical values. My goal is to rearrange the values within each row so that no value is repeated across columns in ...
Ruam Pimentel's user avatar
1 vote
1 answer
142 views

I have the following long format data frame with columns, id, age, and BMI. I have restricted the dataset such that only people (id) with at least 3 repeated measurements between age 2 weeks and 10 ...
aelhak's user avatar
  • 435
-3 votes
1 answer
46 views

The code below works perfectly fine and outputs the data of interest. However, I am wondering if there is a better solution or a different way to think about the logic. Essentially, I need to filter for the ...
Eizy's user avatar
  • 371
0 votes
1 answer
41 views

I tried to download a .xlsx file from my course. But when I opened the .xlsx file, it turned into something like this. UEsDBBQABgAIAAAAIQBBN4LPbgEAAAQFAAATAAgCW0NvbnRlbnRfVHlwZXNdLnhtbCCiBAIooAAC ...
lokalhangatt's user avatar
0 votes
1 answer
59 views

Based on the data below, I want to calculate the BMI Index for each row and the average for the total row. The BMI Index formula is 'berat' / 'tinggi'. data = [{'nama': '...
lokalhangatt's user avatar
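A small sketch of the calculation described in this question, using the formula exactly as the question states it ('berat' / 'tinggi'). The records below are hypothetical stand-ins, since the original data is cut off in the excerpt; the helper name `add_bmi_index` and the `bmi_index` key are assumptions as well.

```python
from statistics import mean

# Hypothetical sample records; only the key names come from the question.
data = [
    {"nama": "A", "berat": 60.0, "tinggi": 1.5},
    {"nama": "B", "berat": 90.0, "tinggi": 2.0},
]


def add_bmi_index(records):
    """Return copies of the records with a 'bmi_index' field (berat / tinggi)."""
    return [{**r, "bmi_index": r["berat"] / r["tinggi"]} for r in records]


rows = add_bmi_index(data)
average = mean(r["bmi_index"] for r in rows)

for r in rows:
    print(r["nama"], r["bmi_index"])
print("average", average)
```

The average is computed over the per-row results, which is one reading of "the average for the total row".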
0 votes
1 answer
47 views

I have received a dataset in a .csv table. The first three lines of the table look like this: Species,Methods Chlamydomonas pisiformis; Stichococcus bacillaris; Stichococcus subtilis; Pleurococcus ...
Ginko-Mitten's user avatar
0 votes
0 answers
39 views

I am training a classifier. My data comes from multiple datasets, each dataset contains multiple subjects, each subject has performed multiple trials. Currently my data structure on disk looks like ...
Samuel's user avatar
  • 57
1 vote
1 answer
61 views

I have a dataset in R tidyverse and I want to create 192 columns based on comparison with the sp column, just like the mp_comp_1 column. How can I do this for 192 columns in tidyverse? library(...
Hamideh's user avatar
  • 697
1 vote
2 answers
79 views

I am having some trouble conducting pattern matching within a data frame. I am working with the grepl function in R. I have a data frame of 5 local districts in two years (2001 and 2002). I want to check ...
YouLocalRUser's user avatar
2 votes
3 answers
101 views

I have a dataframe of county executives and the year they were inaugurated. I am running a panel study with county-year as the unit of analysis. The date range is 2000 to 2004. I would like to expand ...
YouLocalRUser's user avatar
-1 votes
3 answers
253 views

I am working with a dataframe on county executives. I want to run a panel study where the unit of analysis is the county-year. The problem is that sometimes two or more county executives serve during ...
YouLocalRUser's user avatar
-1 votes
1 answer
42 views

I have a data frame of county executives and the year they were inaugurated. I am running a panel study with county-year as the unit of analysis. The date range is 2000 to 2004. I would like to expand ...
YouLocalRUser's user avatar
1 vote
2 answers
67 views

I have a dataset on county executives and their year of inauguration. I need to break down which year each executive was inaugurated. The problem is that the notation under the "year" variable ...
YouLocalRUser's user avatar
1 vote
3 answers
151 views

I have a dataframe where missingness is indicated by "Z" (there may also be some "z" and NA entries present in the data), and values are entered as characters ("0", "...
jbmchls's user avatar
  • 13
1 vote
3 answers
50 views

I have a large data frame with repeated variables. This is just a sample of my data to illustrate the question: df <- data.frame( ID = rep(1:4, each = 1), CMW = rep(c(10, 20, 30, 30), each = 1),...
Raquel Feltrin's user avatar
-1 votes
1 answer
55 views

I'm quite new to programming languages and I am starting with R in my research predicting dengue disease cases with climatic data. I'm still cleaning my data to work with and this particular one has ...
André Ferrari's user avatar
0 votes
1 answer
50 views

I am trying to add a column to a data frame (df1) from another data frame (df2), but only when the "depth range" from df1 lies within the "depth range" from df2. I'll explain below ...
Chris Wheeler's user avatar
0 votes
1 answer
66 views

The below code (Databricks SQL) produces the table following it. I am trying to adjust this code so that the output only includes zip5 records that have only 1 (or less) of each facility_type ...
Dr.Data's user avatar
  • 191
1 vote
1 answer
51 views

I have two datasets as the ones described below: dfA <- tibble( name = c("John", "Michael", "Brian", "Thomas", "Peter"), expected = c(128.34, ...
jpm92's user avatar
  • 162
0 votes
0 answers
76 views

I'm working on an assignment and we were asked to load the data and make the file run without errors when opening from the teacher's computer. He said: "When writing your code, keep the data ...
Ashraf Taha's user avatar
0 votes
2 answers
115 views

In this solution to removing HTML tags from a string, the string is passed to rvest::read_html() to create an html_document object and then the object is passed to rvest::html_text() to return "...
socialscientist's user avatar
