I have two csv files. Contacts and Users.
How I load data into dataframes and merge them
First, I load a dataframe with the name of the users:
import pandas as pd
import numpy as np
df_users= pd.read_csv('./Users_001.csv',sep=',',usecols=[0,2,3])
Then I load the information from contacts of each user
df_contacts = pd.read_csv('./Contacts_001.csv',sep=',',usecols=[0,1,5,48,55,56,57,83,58])
df_users columns name are: user_id, Name, Surname
df_contacts columns name are: Contact ID, id user owner, fullname, qualification, ...
I want to merge both dataframes using user_id and 'id user owner' since they represent the same information. To to this I first change the name of the columns on df_contacts and then I merge
dfcontactos.columns = ['ID de Contacto','user_id','fullname','qualification','accesibility' ... ]
df_us_cont = pd.merge(dfcontactos,df_usuarios,on='user_id')
Now df_us_cont has the information from users and contacts.
What I want to do
There are only 18 user_id but there are 500 contacts. For each user I want to know:
Number of contacts with qualification < 100
For the contacts that have qualification <100
How many contacts have accesibility >= 4
Accesibility is a discrete number (0-5))
- Number of contacts with qualification > 100 and < 300
- Number of contacts with qualification > 300 -
What I have tried and fail
df_qua_lower100 = df_us_cont[df_us_cont['qualification']<100]
df_qua_lower100['user_id'].value_counts()
So far with this I am able to get the information on how many contacts with qualification<100 has each user_id. But I am unable to look how many have 'accesibility>=4'
I have tried to explain the best I could