I have a data frame named 'sal' that contains salary information for employees across a number of years.

I am trying to calculate the number of job titles that were represented by only one person in the year 2013. From a manual check, I know the answer is 202.

I'm using the following method:

sal[sal['Year'] == 2013]['JobTitle'].nunique()

Data Sample:

    Id  EmployeeName    JobTitle    BasePay OvertimePay OtherPay    Benefits    TotalPay    TotalPayBenefits    Year    Notes   Agency  Status
72926   Gregory P Suhr  Chief of Police 319275.01   0   20007.06    86533.21    339282.07   425815.28   2013        San Francisco   
72927   Joanne M Hayes-White    Chief, Fire Department  313686.01   0   23236   85431.39    336922.01   422353.4    2013        San Francisco   
72928   Samson  Lai Battalion Chief, Fire Suppress  186236.42   131217.63   29648.27    57064.95    347102.32   404167.27   2013        San Francisco   
72929   Ellen G Moffatt Asst Med Examiner   272855.51   23727.91    38954.54    66198.92    335537.96   401736.88   2013        San Francisco   
72930   Robert L Shaw   Dep Dir for Investments, Ret    315572.01   0   0   82849.66    315572.01   398421.67   2013        San Francisco   
72931   David L Franklin    Asst Chf of Dept (Fire Dept)    215265.6    87985.24    30637.48    62890.36    333888.32   396778.68   2013        San Francisco   
72932   Harlan L Kelly-Jr   Executive Contract Employee 313312.52   0   0   82319.51    313312.52   395632.03   2013        San Francisco   
72933   John L Martin   Dept Head V 311758.96   0   1098.64 82476.85    312857.6    395334.45   2013        San Francisco   
72934   Edward D Reiskin    Gen Mgr, Public Trnsp Dept  305307.89   0   0   80860.6 305307.89   386168.49   2013        San Francisco   
72935   Thomas A Siragusa   Asst Chf of Dept (Fire Dept)    215265.6    88028.54    21526.49    61288.58    324820.63   386109.21   2013        San Francisco   
72936   Amy P Hart  Dept Head V 286480.44   0   17188.71    80077.63    303669.15   383746.78   2013        San Francisco   
72937   Yifang  Qian    Senior Physician Specialist 203710  0   119176.84   58810.96    322886.84   381697.8    2013        San Francisco   
72938   Michael J Biel  Deputy Chief 3  278964  0   17587.86    77708.48    296551.86   374260.34   2013        San Francisco   
72939   Raymond A Guzman    Dep Chf of Dept (Fire Dept) 270756.03   0   24181.02    77474.92    294937.05   372411.97   2013        San Francisco   
72940   Marty A Ross    Battalion Chief, Fire Suppress  186236.43   88345.08    38035.09    58991.75    312616.6    371608.35   2013        San Francisco   
72941   Mark A Gonzales Dep Chf of Dept (Fire Dept) 270756.01   0   20236.5 77408.16    290992.51   368400.67   2013        San Francisco   
72942   Mark J Johnson  Battalion Chief, Fire Suppress  186236.41   101466.96   23994.92    56134.3 311698.29   367832.59   2013        San Francisco   
72943   Bryan W Rubenstein  Battalion Chief, Fire Suppress  186236.45   94450.92    30313.49    56508.46    311000.86   367509.32   2013        San Francisco   
72944   Gary L Altenberg    Lieutenant, Fire Suppression    135903.02   163477.81   20994.96    46030.76    320375.79   366406.55   2013        San Francisco   
72945   John J Loftus   Deputy Chief 3  274126.5    0   13358.1 75909.1 287484.6    363393.7    2013        San Francisco   
72946   Edwin M Lee Mayor   285446.37   0   0   77105.29    285446.37   362551.66   2013        San Francisco   
72947   Michael J Morris    Assistant Deputy Chief 2    124054  0   202322.37   35929.84    326376.37   362306.21   2013        San Francisco   
72948   David  Shinn    Deputy Chief 3  278964  0   6428.79 76680.57    285392.79   362073.36   2013        San Francisco   
72949   Arthur W Kenney Asst Chf of Dept (Fire Dept)    213308.64   49139.25    36262.42    60756.95    298710.31   359467.26   2013        San Francisco   
72950   Lorrie A Kalos  Battalion Chief, Fire Suppress  186236.49   87457.68    28003.53    57030.95    301697.7    358728.65   2013        San Francisco   
72951   Lyn  Tomioka    Deputy Chief 3  278964  0   3536.35 76113.13    282500.35   358613.48   2013        San Francisco   
72952   Denise A Schmitt    Deputy Chief 3  278964  0   3536.39 75367.15    282500.39   357867.54   2013        San Francisco   
72953   Rudy J Castellanos  Battalion Chief, Fire Suppress  186236.42   94274.25    19022.95    55351.53    299533.62   354885.15   2013        San Francisco   
72954   Susan  Currin   Adm, SFGH Medical Center    271831.5    0   5000    75511.72    276831.5    352343.22   2013        San Francisco   
72955   Thomas F Abbott Battalion Chief, Fire Suppress  186236.41   84382.38    23279.44    56184.01    293898.23   350082.24   2013        San Francisco   
72956   Naomi M Kelly   Dept Head V 270641.5    0   3000    74867.87    273641.5    348509.37   2013        San Francisco   
72957   Trent E Rhorer  Dept Head V 270641.56   0   3000    74769.34    273641.56   348410.9    2013        San Francisco   
72958   Barbara A Garcia    Dept Head V 270591.04   0   3050.5  74769.33    273641.54   348410.87   2013        San Francisco   
72959   Robert F Postel Asst Chf of Dept (Fire Dept)    212244.54   62490.6 13450.16    58778.57    288185.3    346963.87   2013        San Francisco   
72960   Jeffrey J Barden    Captain, Fire Suppression   155174.49   124293.83   18151.93    49001.55    297620.25   346621.8    2013        San Francisco   

The method above returns 1051, which is incorrect. Could someone explain why my logic is wrong and suggest an alternative method?

Thanks!!!

  • Can you provide some sample data? One thought: some job titles may be capitalized differently or contain trailing spaces, which could be inflating your numbers. Commented Apr 2, 2020 at 2:34
  • Hi @bbd108 I've added some sample data to the post, hope that helps! Commented Apr 2, 2020 at 2:44
  • ['JobTitle'].nunique() gives you the number of unique job titles; for example, 2 truck drivers, 3 doctors, and 2 nurses would give you 3 job titles. It does not give you the number of employees with a unique job title. Commented Apr 2, 2020 at 2:46
  • Thanks @QuangHoang, as you've pointed out, my logic wasn't quite right! Commented Apr 2, 2020 at 2:59

1 Answer

To answer my own question, my logic was wrong:

sal[sal['Year'] == 2013]['JobTitle'].nunique()

will count the number of unique job titles, so if 10 people have the job title 'Engineer', that title is counted only once.
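
To see this on a small scale, here is a minimal sketch using a made-up data frame. The `toy` frame and its rows are invented for illustration and are not taken from the real salary data:

```python
import pandas as pd

# Hypothetical toy data: 3 engineers, 2 nurses, 1 mayor, all in 2013.
toy = pd.DataFrame({
    "Year": [2013] * 6,
    "JobTitle": ["Engineer", "Engineer", "Engineer", "Nurse", "Nurse", "Mayor"],
})

# nunique() counts distinct titles, regardless of how many people hold each one.
distinct_titles = toy[toy["Year"] == 2013]["JobTitle"].nunique()
print(distinct_titles)  # 3
```

Only 'Mayor' is held by a single person here, yet the result is 3, because every distinct title is counted once.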

The answer I was looking for was 'the number of job titles that were represented by only one person', which I found using:

sum(sal[sal['Year'] == 2013]['JobTitle'].value_counts() == 1)
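
As a rough sketch of the same idea, here it is wrapped in a function and run on invented data. The function name `count_singleton_titles` and the toy rows are my own, for illustration only; the real 'sal' frame is assumed to have the same 'Year' and 'JobTitle' columns:

```python
import pandas as pd

def count_singleton_titles(df, year):
    """Count job titles held by exactly one person in the given year."""
    # Select the JobTitle column for rows matching the year.
    titles = df.loc[df["Year"] == year, "JobTitle"]
    # value_counts() gives the number of people per title.
    counts = titles.value_counts()
    # Keep only titles that appear exactly once and count them.
    return int((counts == 1).sum())

# Hypothetical toy data, not from the real data set.
toy = pd.DataFrame({
    "Year": [2013, 2013, 2013, 2013, 2014],
    "JobTitle": ["Engineer", "Engineer", "Mayor", "Clerk", "Mayor"],
})

singletons = count_singleton_titles(toy, 2013)
print(singletons)  # 2
```

In 2013 the toy data has two engineers, one mayor, and one clerk, so two titles are held by a single person, while nunique() on the same rows would return 3.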