I want to create a DataFrame using pandas where one column is 'EmployeeID' and the second is the 'skill' level the employee has, ranging from 1 to 5. The 'EmployeeID' column should have unique values, whereas the 'skill' column can have repeated values.

  1. I tried to generate the 'EmployeeID' column using the code below:

    df = pd.DataFrame({'EmployeeID':[random.sample(range(123456,135000),100)]})

but the result is not what I expected. It generated all the numbers and put them in a single row.

  2. random.sample gives me unique values. How can I generate 100 values in a given range that are allowed to repeat? I tried randint, but it has no option for passing the count of numbers to generate.
  • What is it that you expect? Commented Oct 28, 2017 at 14:23
  • Use np.random.randint: pd.DataFrame({'EmployeeID': np.random.randint(123456, 135000, 100)}) Commented Oct 28, 2017 at 14:24
  • Do not wrap it in a list, since random.sample already returns a list. Commented Oct 28, 2017 at 14:24
  • OK. That's the mistake I was making by using a list. That clears up my first query. How about my second query? Do I have to write a for loop? Commented Oct 28, 2017 at 14:27
  • Please try to explain your problem more clearly in future. "How can i generate 100 repetitive values in a given range?" does NOT attempt to explain clearly what your problem is and what you want. Some expected output would also help. Sure, this time there was someone ready to pander to your needs, but that won't always happen. Please keep in mind that you are leaving behind a digital carbon footprint for future generations to come across your question should they have the same problem, so don't disappoint. Commented Oct 28, 2017 at 14:41

1 Answer

Use numpy.random.randint + numpy.tile if you need to repeat the 1-5 range in order:

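import numpy as np
import pandas as pd
# EmployeeID: 100 random integers (repeats possible); skill: 1..5 tiled 20 times
df = pd.DataFrame({'EmployeeID': np.random.randint(123456, 135000, 100),
                   'skill': np.tile(np.arange(1, 6), 20)})
print(df.head(10))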
   EmployeeID  skill
0      129323      1
1      126570      2
2      124034      3
3      129659      4
4      125654      5
5      127093      1
6      123780      2
7      125665      3
8      124063      4
9      125061      5

If you need random values in the range 1-5 for the skill column instead, use randint for both columns (the upper bound 6 is exclusive):

df = pd.DataFrame({'EmployeeID': np.random.randint(123456, 135000, 100),
                   'skill': np.random.randint(1, 6, 100)})
print(df.head(10))
   EmployeeID  skill
0      131496      2
1      133133      4
2      130999      2
3      127685      5
4      129008      1
5      124238      3
6      124147      3
7      123592      3
8      133859      1
9      126097      3
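
Note that np.random.randint samples with replacement, so the EmployeeID values in the snippets above are not guaranteed to be unique. If uniqueness matters, as the question states, one possible approach is to keep random.sample for the IDs (passed directly, not wrapped in another list) and use randint only for the skill column. This is a minimal sketch rather than the only way to do it; the name N_EMPLOYEES is my own and simply holds the row count of 100 from the question:

```python
import random

import numpy as np
import pandas as pd

N_EMPLOYEES = 100  # hypothetical constant: the row count used in the question

# random.sample draws without replacement, so every ID is distinct.
# Pass the resulting list directly; wrapping it in another list
# produces a single row holding the whole list.
employee_ids = random.sample(range(123456, 135000), N_EMPLOYEES)

# randint's upper bound is exclusive, so (1, 6) yields skills 1 through 5.
# Repeated values are expected in this column.
skills = np.random.randint(1, 6, N_EMPLOYEES)

df = pd.DataFrame({'EmployeeID': employee_ids, 'skill': skills})
print(df.head(10))
```

You can confirm the result with df['EmployeeID'].is_unique, which should be True here.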