I'm trying to test whether there is a statistically significant difference in the frequency of play among recreational soccer players, based on their stated reasons for playing.
For example: the average number of times per week someone plays soccer when they state that exercise is their reason for playing, compared to the frequency for someone who says the social aspect is their reason. How might I set up such a hypothesis test?
Answer & Explanation
There are vague aspects to your question. Here are some suggestions, considerations, and one possibly relevant statistical procedure.
You should begin by specifying the population of occasional soccer players in which you are interested. Perhaps undergraduates at a particular university or in a particular region. What is your definition of 'recreational soccer player'? In a diverse student population, are you going to set age limits? Will you include women?
Sample a number of people from this population. Use the answer to the motivational question to sort them into groups E (exercise) and S (social). Then determine the 'typical' number of times per week each subject plays soccer. You need to be clear what you mean by 'typical' and 'playing soccer'. (Is it better to ask for the typical number of hours a week spent playing soccer?)
Do a two-sample t test ('Welch', 'non-pooled', or 'separate variances') to see if the sample means for the two groups indicate that the population means for E and S differ. (You will also need the sample standard deviations of the two groups.)
Consult a basic statistics text for the formula of the test statistic and how to decide whether to reject the hypothesis that groups E and S have equal population means. Most statistical software packages will perform such a test.
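As a rough illustration of what such a formula looks like, here is a minimal Python sketch that computes the Welch t statistic and its approximate (Welch-Satterthwaite) degrees of freedom from two samples. The function name `welch_t` and the numbers in `group_e` and `group_s` are made up for illustration only; they are not real survey data. Under the null hypothesis of equal population means, t is referred to a Student t distribution with the computed degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(e, s):
    """Return (t, df) for the Welch (separate-variances) two-sample t test.

    e, s: sequences of plays-per-week values for groups E and S.
    """
    n_e, n_s = len(e), len(s)
    # Estimated variance of each sample mean (sample variance / n).
    v_e = variance(e) / n_e
    v_s = variance(s) / n_s
    # Test statistic: difference in sample means over its standard error.
    t = (mean(e) - mean(s)) / sqrt(v_e + v_s)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v_e + v_s) ** 2 / (v_e ** 2 / (n_e - 1) + v_s ** 2 / (n_s - 1))
    return t, df


# Hypothetical plays-per-week values, for illustration only.
group_e = [3, 4, 2, 5, 3, 4]
group_s = [1, 2, 2, 1, 3, 2]

t_stat, df = welch_t(group_e, group_s)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

To get a P-value, compare t with the Student t distribution with df degrees of freedom, using a table or software. If SciPy happens to be available, `scipy.stats.ttest_ind(group_e, group_s, equal_var=False)` performs the whole Welch test in one call.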
You should look at the data to see whether the numerical values within E and S are consistent with a normal distribution. (If not, other kinds of 2-sample tests are available.)
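One such alternative, chosen here only as an example and not the only option, is a permutation test on the difference in means, which does not rely on normality. The sketch below is a minimal version under stated assumptions: the function name `permutation_test`, the number of resamples, the seed, and the data values are all hypothetical choices for illustration.

```python
import random


def permutation_test(e, s, n_resamples=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns (observed difference in means, approximate P-value).
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n_e = len(e)
    observed = sum(e) / n_e - sum(s) / len(s)
    pooled = list(e) + list(s)
    extreme = 0
    for _ in range(n_resamples):
        # Randomly reassign the group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_e]) / n_e - sum(pooled[n_e:]) / (len(pooled) - n_e)
        # Count relabelings at least as extreme as the observed difference
        # (small tolerance guards against floating-point ties).
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimated P-value away from exactly zero.
    return observed, (extreme + 1) / (n_resamples + 1)


# Hypothetical plays-per-week values, for illustration only.
group_e = [3, 4, 2, 5, 3, 4]
group_s = [1, 2, 2, 1, 3, 2]

obs_diff, p_value = permutation_test(group_e, group_s)
print(f"observed difference = {obs_diff:.3f}, P-value = {p_value:.4f}")
```

A small P-value suggests that a difference as large as the one observed would be unusual if the group labels were unrelated to frequency of play. A rank-based test such as the Wilcoxon rank-sum (Mann-Whitney) test is another common choice.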
In the end, if you do not find a statistically significant difference, you might do a power computation to find out whether you have enough subjects in each group to have a reasonable chance of detecting a difference of a particular size. (Ideally, this sample-size determination would be done in advance of your 'study', but you probably wouldn't know enough without a pilot run to answer the necessary questions.)
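To give a feel for the kind of computation involved, here is a minimal sketch of a sample-size and power calculation using the usual normal approximation for comparing two means with equal group sizes and a common standard deviation. The inputs (a difference of 0.5 plays per week and a standard deviation of 1.0) are assumptions invented for illustration; in practice they would have to come from a pilot run or prior knowledge. The function names are likewise hypothetical. Because it uses the normal rather than the t distribution, this approximation tends to give slightly smaller sample sizes than an exact t-based computation.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to detect a mean difference delta."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = STD_NORMAL.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 * sigma ** 2 / delta ** 2)


def approx_power(n, delta, sigma, alpha=0.05):
    """Approximate power of a two-sided test with n subjects per group."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    se = sigma * sqrt(2 / n)  # standard error of the difference in means
    # Ignores the negligible chance of rejecting in the wrong direction.
    return STD_NORMAL.cdf(abs(delta) / se - z_alpha)


# Assumed inputs, for illustration only.
n_needed = n_per_group(delta=0.5, sigma=1.0)
print(n_needed, round(approx_power(n_needed, delta=0.5, sigma=1.0), 3))
```

Smaller differences or noisier data require many more subjects, because the required n grows with the square of sigma/delta.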