beobachtereb

Answered question

2022-09-18

Statistical analysis of study with categorical and numerical variables
I am researching the effect of a certain innovation type on firm performance. The innovation type is measured through a 6-item survey with nominal answers (yes/no; 1/0) and is retrospective (e.g. "Did you introduce XY in the last 5 years?"). For firm performance I have financial data for the 5-year period I'm interested in. Now, there are two possible approaches I could take:
1) I compute an "innovation" variable from the survey answers to distinguish between adopters and non-adopters, and I examine whether the adopter group shows better firm performance than the non-adopter group. Which type of analysis would this be? And how would I control for firm size and time effects?
2) I investigate whether firms that answered more questions with "yes" perform better than firms that answered fewer questions with "yes". Which type of analysis would this then be? Regression?

Answer & Explanation

Kellen Blackburn

Beginner, 2022-09-19, added 8 answers

Variable 1: Adopter status. It seems OK to define Adopter and Non-Adopter firms in terms of the survey. That's a categorical variable with two levels.
Variable 2: Performance. If Performance is measured in terms of financial data, that would start as a numerical variable. You might use percentage increase to scale this to size. Controlling for time is more difficult, because your questionnaire asks 'within the last five years' and that might not be a good match for your 'five years' of financial data.
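As a rough illustration of scaling performance to firm size, the sketch below computes the percentage change between the first and last year, plus log growth as an alternative. The function names and the revenue figures are hypothetical, chosen only to show that a small and a large firm with the same relative growth get the same score.

```python
import math

def pct_change(first, last):
    """Percentage change from the first to the last year's value."""
    if first <= 0:
        raise ValueError("percentage change needs a positive starting value")
    return 100.0 * (last - first) / first

def log_growth(first, last):
    """Log growth, ln(last / first); requires positive values."""
    if first <= 0 or last <= 0:
        raise ValueError("log growth needs positive values")
    return math.log(last / first)

# Hypothetical revenue (year 1, year 5) for a small and a large firm.
firms = {"small": (2.0, 3.0), "large": (200.0, 300.0)}
growth = {name: pct_change(a, b) for name, (a, b) in firms.items()}
print(growth)  # both firms grew by 50 percent despite very different sizes
```

Percentage change breaks down when the starting value is zero or negative (e.g. profit), so a different performance measure may be needed in that case.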
Possible tests. Maybe start by making plots of raw increases over five years for the two Adoption groups separately.
(i) Look to see whether the data seem reasonable for a standard two-sample t test (roughly normal, not horribly skewed toward high values, free of outrageous outliers, variances within the same order of magnitude). If a t test seems OK and shows a significant difference, then you have something to talk about.
(ii) If a t test is not OK, maybe try taking logs of the financial data and have a second look. Or use a nonparametric test such as the Wilcoxon rank sum test, which compares the locations of the two groups (often described as a test of differences in medians).
(iii) If the numerical data do not lead to a good comparison, then try defining several (maybe three) performance categories: e.g., 'Losers', 'Plodders', 'Winners'. Then do a chi-squared test of independence in the 2×3 contingency table.
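To make options (i) to (iii) concrete, here is a minimal sketch using only the Python standard library. The function names and the growth figures are hypothetical. The t function returns only the pooled-variance statistic and its degrees of freedom (the p-value needs a t distribution, which the standard library lacks), and the rank sum p-value uses a normal approximation without tie or continuity correction. In practice a statistics library such as SciPy would normally be used instead.

```python
import math
from statistics import NormalDist, mean, variance

def pooled_t(a, b):
    """Standard (pooled-variance) two-sample t statistic and its df."""
    n, m = len(a), len(b)
    sp2 = ((n - 1) * variance(a) + (m - 1) * variance(b)) / (n + m - 2)
    t = (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / n + 1 / m))
    return t, n + m - 2

def rank_sum_p(a, b):
    """Two-sided Wilcoxon rank sum p-value via the normal approximation.
    No tie or continuity correction, so it is rough for small samples."""
    pooled = sorted(list(a) + list(b))
    # Tied values share the average of the ranks they occupy.
    rank = {v: pooled.index(v) + 1 + (pooled.count(v) - 1) / 2
            for v in set(pooled)}
    n, m = len(a), len(b)
    w = sum(rank[v] for v in a)           # rank sum of the first group
    mu = n * (n + m + 1) / 2              # its mean under the null
    sigma = math.sqrt(n * m * (n + m + 1) / 12)
    z = (w - mu) / sigma
    return 2 * (1 - NormalDist().cdf(abs(z)))

def chi2_2x3(table):
    """Chi-squared test of independence for a 2x3 table of counts.
    With df = 2 the upper-tail p-value is exactly exp(-stat / 2)."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    total = sum(rows)
    stat = 0.0
    for i in range(2):
        for j in range(3):
            expected = rows[i] * cols[j] / total
            stat += (table[i][j] - expected) ** 2 / expected
    return stat, math.exp(-stat / 2)

# Hypothetical five-year percentage increases per firm.
adopters = [12.0, 15.0, 9.0, 20.0, 14.0]
non_adopters = [8.0, 11.0, 7.0, 10.0, 13.0]
print(pooled_t(adopters, non_adopters))
print(rank_sum_p(adopters, non_adopters))

# Hypothetical counts: rows = adopters / non-adopters,
# columns = Losers / Plodders / Winners.
print(chi2_2x3([[5, 10, 15], [12, 11, 7]]))
```

The chi-squared approximation is usually considered reliable only when the expected counts in each cell are not too small, so very small samples may call for merging categories.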
Ethical issues: disclosure and reproducibility. This stepwise approach has some potential issues of statistical ethics. It could be claimed that you are tailoring the method of analysis just to find something that produces a significant result. (That is why in clinical drug trials the FDA insists on an up-front protocol with details of subject selection and treatment, and details of statistical analysis.) If this turns into a publishable paper you would be obligated to mention (if only in footnotes) the exploratory path that led to the reported analysis. That way someone trying to replicate or extend your work would know in advance how to design the study.
Planning ahead. Of course, this might have gone better if you had given thought to the comparability issues when designing your survey questions. Rules for an effective survey are based on questions: (a) Will subjects understand what I mean by the questions? (b) Will they know the answers without undue investigation? (c) Will they be willing to share the info truthfully? (d) Exactly what am I going to do with the answers when I get them?
