Keywords: multivariate, research, dissertation, thesis, factor loading analysis, orthogonal factor rotation
analysis, exploratory factor analysis, confirmatory factor analysis, structural equation modeling
By: Sourabh Kishore, Chief Consulting Officer
Thesis and Dissertation projects with Multivariate Statistical
Modelling and Analysis
A research problem may be univariate, bivariate, or multivariate. A univariate problem is concerned with a single research variable, and a bivariate
problem is concerned with the relationship (typically the linear relationship) between two research variables. In practice, a univariate study often covers
several independent research variables without examining their quantitative mutual relationships. For example, a single study may examine
attitude, organizational commitment, and employee performance separately in a fast food chain without quantifying the relationships among them.
Some researchers may design triangulation studies by collecting numerical data about the three variables but establishing their interrelationships
On the other hand, bivariate research problems study the relationship between two variables by establishing a null and an alternate
hypothesis. Most bivariate research problems investigate mutual relationships between pairs of variables through multiple independent
hypotheses. However, the hypotheses may not be interrelated in the form of a structure or theoretical framework. The hypotheses may be tested using
bivariate techniques such as correlation analysis, regression analysis, analysis of variance, Student's t-test, or the Chi-square test, with the decision usually
based on the p-value. The outcomes may be definitive causal relationships (influence of an independent variable on a dependent variable) or simply a
reflection of how one parameter varies with respect to another within a controlled research setting. Normally, establishing a relationship between two
variables does not guarantee that a causal relationship has been found. Cause-effect relationships can be established by taking support from established
theories or by investigating additional variables that influence the two variables. This is where multivariate problems come into the picture.
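As a rough illustration of a bivariate test, the sketch below computes a Pearson correlation coefficient and the matching t statistic for the null hypothesis of no linear relationship. The variable names and the five scores are hypothetical, chosen only for illustration; a statistics package would normally also report the p-value.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for H0: no linear relationship (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical scores for five respondents (illustrative values only).
commitment = [1, 2, 3, 4, 5]
performance = [2, 4, 5, 4, 5]

r = pearson_r(commitment, performance)
t = t_statistic(r, len(commitment))
print(f"r = {r:.3f}, t = {t:.3f}")
```

A sizeable r here says only that the two scores move together in this sample; it does not by itself show that one causes the other.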
Multivariate problems are different and more complex, requiring sophisticated techniques for investigating relationships among multiple variables. Most
multivariate problems require investigation of complex structures rather than mere relationships. Hence, applying statistics to multivariate problems is not
only about statistical calculations but also involves complex statistical modeling. A model may be in the form of a theoretical framework or an initial
measurement model. Before the multivariate techniques are discussed, it is important to differentiate between a theoretical framework and an initial
measurement model.
A theoretical framework is formed by conducting an intensive literature review and creating a structure whose relationships are grounded in theory. On the
other hand, an initial measurement model can be established using the principal component analysis technique employing orthogonal factor rotation.
Technically, the models created by both approaches are considered initial models and are taken through the same reliability, validity, and
model fit tests. However, research studies involving theory-based formation of the initial model (commonly referred to as the theoretical
framework) are confirmatory or extended studies, whereas research studies involving the principal component analysis technique are exploratory studies. In
practice, a theory-based modeling approach should be chosen if the model can be grounded on an extensive and deep theoretical foundation, whereas the
principal component analysis technique should be chosen if the model is not sufficiently supported by theories.
Multivariate problems come in two flavours: relationships among multiple observable (measurable) variables, or relationships between one or more
groups of observable variables and a group of latent (unobservable, or not directly measurable) variables. The latter is used in highly complex research studies.
The techniques used in multivariate statistical modeling are applied in sequence: exploratory factor analysis, confirmatory factor analysis, and structural
equation modeling. The exploratory factor analysis technique may be skipped if theory-based initial modeling has been preferred. In exploratory factor
analysis, the number of latent (unobserved) variables influenced by a set of observed variables is explored by obtaining a rotated factor
solution using an orthogonal rotation method (VARIMAX, QUARTIMAX, or EQUAMAX) or an oblique one (PROMAX or DIRECT OBLIMIN). The most used
orthogonal factor rotation method is VARIMAX. The number of latent variables is determined by the number of extracted factors having an eigenvalue
above unity. The researcher may predetermine the number of latent variables or simply proceed to investigate the factors having eigenvalues above unity.
If the number is predetermined, it should not exceed the number of factors having eigenvalues above unity. The eigenvalues can be inspected on a scree plot.
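The eigenvalue-above-unity rule can be sketched in a few lines. The example below is only an illustration under stated assumptions: the 6 x 6 correlation matrix is hypothetical (two uncorrelated groups of three variables), and the eigenvalues are found with a plain cyclic Jacobi routine written here for self-containment, not with the procedure SPSS uses internally. In practice a library eigenvalue routine would be used instead.

```python
import math

def jacobi_eigenvalues(matrix, tol=1e-12, max_sweeps=50):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    for _ in range(max_sweeps):
        off_diagonal = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off_diagonal < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)

# Hypothetical correlation matrix: variables 1-3 correlate at 0.6,
# variables 4-6 correlate at 0.5, and the two groups are uncorrelated.
correlation = [
    [1.0, 0.6, 0.6, 0.0, 0.0, 0.0],
    [0.6, 1.0, 0.6, 0.0, 0.0, 0.0],
    [0.6, 0.6, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.5, 0.5],
    [0.0, 0.0, 0.0, 0.5, 1.0, 0.5],
    [0.0, 0.0, 0.0, 0.5, 0.5, 1.0],
]

eigenvalues = jacobi_eigenvalues(correlation)        # sorted, largest first
retained = [ev for ev in eigenvalues if ev > 1.0]    # eigenvalue-above-unity rule
print([round(ev, 3) for ev in eigenvalues])
print("factors with eigenvalue above unity:", len(retained))
```

For this block-structured matrix the eigenvalues are 2.2, 2.0, 0.5, 0.5, 0.4, and 0.4, so two factors are retained. Plotting the sorted eigenvalues against their rank gives the scree plot.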
The rotated factor table obtained after rotation is of prime importance. It gives the loading of each observed variable on each latent variable.
Normally, variables with significant loadings are selected and the rest rejected. The significance of a loading is determined by the loading value (normally
0.5 or greater) or by the importance of the observed variable in the reliability test. The researcher may name each latent variable by
analyzing the group of observed variables loading on it, or with the help of the literature. Each group forms a scale representing the corresponding latent
variable. The researcher may test the reliability of each scale using the Cronbach Alpha, Split Half, Guttman, Parallel, or Strict Parallel techniques. In the
Cronbach Alpha test, an alpha value of 0.6 or greater is considered a good reliability indicator for a scale if the research involves responses from human
subjects (for example, phenomenology and grounded theory studies). However, researchers prefer a higher alpha value in scientific and
technology-based research studies in which the primary data is collected from experiments or simulations. Normally, an observed
variable having a high loading on the latent variable is a good contributor to the Cronbach Alpha value. However, sometimes an observed variable with
a low loading (below 0.5) may prove to be a better contributor to the Cronbach Alpha value. The contribution of each observed variable to the Cronbach
Alpha value of the scale can be determined from a table called "scale if item deleted". In some research studies, the researcher may decide to conclude the
research if very high reliability values of the scales are achieved. However, it is not guaranteed that these scales comprising groups of highest loading
observed variables are the causal factors influencing the latent variables. It is recommended that a few validity tests are also conducted. This is where the
confirmatory factor analysis technique is useful.
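The Cronbach Alpha value and the "if item deleted" figures can be reproduced by hand. The following is a minimal sketch using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of the total score); the four items and six respondents are hypothetical data invented for illustration, and the function names are not taken from any package.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of respondent scores per observed variable (item)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

def alpha_if_item_deleted(items):
    """Alpha of the scale recomputed with each item left out in turn."""
    return [cronbach_alpha(items[:i] + items[i + 1:]) for i in range(len(items))]

# Hypothetical 5-point responses from six respondents to four items.
items = [
    [4, 5, 3, 4, 2, 5],  # q1
    [4, 4, 3, 5, 2, 5],  # q2
    [5, 5, 2, 4, 3, 4],  # q3
    [2, 5, 4, 1, 3, 3],  # q4, deliberately inconsistent with the others
]

alpha = cronbach_alpha(items)
deleted = alpha_if_item_deleted(items)
print(f"alpha = {alpha:.3f}")
for name, value in zip(["q1", "q2", "q3", "q4"], deleted):
    print(f"alpha if {name} deleted = {value:.3f}")
```

In this made-up data the full scale has an alpha of about 0.573, and dropping q4 raises it to about 0.879, which is the kind of pattern the "if item deleted" table is used to spot.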
The confirmatory factor analysis technique helps in running validity tests on the model determined either through the theory-based approach or through
the exploratory factor analysis technique. It involves computation of Average Variance Extracted (AVE), Cronbach Alpha, Degrees of Freedom, Root Mean
Square Error of Approximation (RMSEA), Root Mean Square Residual (RMR), and Standardized Root Mean Square Residual (SRMR) values. Various
research scholars recommend thresholds for determining the validity of the model, based on the research area and the sample size. One should
decide the thresholds carefully before validating the model. If the objective is simply to validate the initial model, the researcher may conclude the
research at this stage. However, there can be situations when the initial model returns unreliable scales and invalid relationships. This is unlikely if the
initial model has been constructed with utmost care. But the researcher should be ready to face surprises and should not panic, because the Structural
Equation Modeling technique can rescue the research from a probable failure.
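Two of the values named above are simple enough to compute by hand once a model has been estimated. The sketch below assumes standardized loadings, for which AVE reduces to the mean of the squared loadings, and uses the common (N - 1) form of the RMSEA point estimate. The loadings, chi-square, degrees of freedom, and sample size are hypothetical numbers, not output from any real model.

```python
import math

def average_variance_extracted(loadings):
    """AVE for one scale, assuming standardized loadings."""
    return sum(loading ** 2 for loading in loadings) / len(loadings)

def rmsea(chi_square, df, n):
    """RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))

# Hypothetical standardized loadings of three observed variables on one latent variable.
ave = average_variance_extracted([0.8, 0.7, 0.6])

# Hypothetical model chi-square, degrees of freedom, and sample size.
rmsea_value = rmsea(chi_square=36.0, df=24, n=201)

print(f"AVE = {ave:.3f}, RMSEA = {rmsea_value:.3f}")
```

Whether values such as these pass depends on the thresholds the researcher has adopted from the literature for the research area and sample size.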
Structural Equation Modeling helps in finding an alternate model having acceptable reliability and validity scores if the initial model has failed due to
some unavoidable and irreparable issues. The technique allows the researcher to test multiple models by varying the relationships among variables and
finally choose the best fit model. The test statistics that help in choosing the best fit model are the goodness-of-fit index (GFI), adjusted goodness-of-fit
index (AGFI), normed fit index (NFI), non-normed fit index (NNFI), comparative fit index (CFI), parsimony fit index, and incremental fit index (IFI). It
should be noted that not all of these are suitable for every research study. The researcher should choose the most appropriate ones depending upon the
area of research and the sample size. It is recommended to study the literature widely before choosing the most appropriate fit indices in structural
equation modeling.
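Several of these indices compare the chi-square of a candidate model with that of a null (independence) model. The sketch below uses the standard formulas for the normed, non-normed, and comparative fit indices to compare two candidate models. All chi-square and degrees-of-freedom values are hypothetical, and ranking by the comparative fit index alone is a simplification made for illustration.

```python
def fit_indices(chi_model, df_model, chi_null, df_null):
    """Normed, non-normed, and comparative fit indices against a null model."""
    nfi = (chi_null - chi_model) / chi_null
    ratio_null = chi_null / df_null
    nnfi = (ratio_null - chi_model / df_model) / (ratio_null - 1)
    d_model = max(chi_model - df_model, 0.0)
    d_null = max(chi_null - df_null, d_model, 0.0)
    cfi = 1.0 if d_null == 0 else 1 - d_model / d_null
    return {"NFI": nfi, "NNFI": nnfi, "CFI": cfi}

# Hypothetical null-model and candidate-model results: (chi-square, degrees of freedom).
null_model = (636.0, 36)
candidates = {"initial": (90.0, 26), "alternate": (36.0, 24)}

results = {
    name: fit_indices(chi, df, *null_model) for name, (chi, df) in candidates.items()
}
best = max(results, key=lambda name: results[name]["CFI"])

for name, indices in results.items():
    print(name, {key: round(value, 3) for key, value in indices.items()})
print("best fit by CFI:", best)
```

With these made-up figures the alternate model scores NFI of about 0.943, NNFI of 0.97, and CFI of 0.98, ahead of the initial model on all three.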
The recommended tool for applying the exploratory factor analysis technique is SPSS, and the tool recommended for confirmatory factor analysis and
structural equation modeling is LISREL. If you need any help in designing a research study, collecting data, applying techniques for data analysis, and
deriving meaningful conclusions and recommendations in multivariate research involving exploratory factor analysis, confirmatory factor analysis, and
structural equation modeling, you may contact us at email@example.com and firstname.lastname@example.org. We recommend using
Survey Monkey for collecting data and the latest academic versions of SPSS and LISREL for applying these techniques. The academic version of LISREL
cannot be used if the number of variables is greater than 15. However, in most cases the number of variables can be reduced to 15 or fewer if the Principal
Component Analysis technique has been used and reliable scales have been constructed by testing their Cronbach Alpha values. This is another advantage
of starting the research with exploratory factor analysis rather than a theory-based structural framework. In some research studies, it may not be possible
to keep the number of variables at 15 or below. In such cases, it is recommended that a professional copy of LISREL is purchased. Ideally, the number of
variables should be kept as low as possible, especially if the sample size is small (say, less than 100). The higher the number of variables, the greater the
difficulty in determining the best fit model using Structural Equation Modeling. Most modern causal research problems require application of
multivariate techniques, and hence it is recommended to master SPSS and LISREL in this context. We can support multivariate research studies in all the
research areas mentioned on the page detailing our SUBJECT AREAS OF SPECIALIZATION.
Factors and latent variables may be chosen as per the problem description. Typically, latent variables are the ones that cannot be measured directly.
Examples are: human attitude, human feelings, commitment to the organisation, willingness to work in a particular field, and behavioural aspects in
groups or teams. However, variables for which data is unavailable because of a lack of systems and processes can also be chosen as latent variables. The
factors influencing the chosen latent variables under study may be chosen from past research studies, journal articles, professional studies, industrial
reports, press releases, and expert advice. The structure of the theoretical framework may be designed by applying the exploratory factor analysis
technique, or based on literature reviews providing adequate information on structural models involving the factors (observed variables) and the latent
variables under study.
Some of the examples of multivariate problems are the following:
(a) Influence of organisational citizenship behaviour, organisational commitment, behavioural aspects with peers and superiors, and willingness to
participate on effectiveness of information security governance in an organisation
(b) Influence of organisational citizenship behaviour, organisational commitment, behavioural aspects with peers and superiors, and willingness to
participate on project performance
(c) Influence of multiple personality types on effectiveness of crisis management decision-making and change management
In the above examples, the influencing variables are unobservable and hence need to be treated as latent variables. In order to measure them, the factors
affecting them need to be taken from the literature. The models will take the form:
Factor groups ---> Latent variables ---> Output variables
The factor groups representing each latent variable are the scales with high reliability (Cronbach Alpha value of 0.6 or more). The scales can be obtained
from exploratory factor analysis or literature-supported groups. The rest of the analysis can be completed through confirmatory factor analysis and
structural equation modeling.
Copyright 2011 ETCO INDIA. All Rights Reserved