Sampling procedures and calculation for sample size determination: criteria and methods adopted in theses and dissertations in Human Movement Sciences - a descriptive study

Quantitative monographic studies systematically use inferential statistical procedures to test hypotheses. For this purpose, sampling procedures and sample sizes need to be adequate for the proposed procedures. The aim of this study was to identify the sample selection methods, as well as the use and types of calculation to determine the sample size, adopted in theses and dissertations developed in a graduate program in the field of Physical Education. Theses and dissertations defended between 2003 and 2013 were obtained through a digital repository. Only quantitative studies were included, in which the following issues were analyzed: (1) sample selection criteria; (2) presence of sample calculation; (3) calculation type to estimate sample size. A total of 199 studies were included. Of these, 6% (n=11) used probabilistic methods for sample selection and 3% (n=6) used animal models. As for sample calculations, 36% (n=72) of the studies reported having adopted this procedure. Of the studies that performed sample calculations, 25% (n=18) used predictive equations, 67% (n=48) used methods based on statistical power, 3% (n=2) used the confidence interval, 4% (n=3) did not mention the method and 1% (n=1) based the calculation on the type of statistical test to be used later. Non-probabilistic sampling methods predominate for the selection of subjects; most studies do not report adopting calculations to estimate sample size and, among those that do, models that consider statistical power as the main criterion are predominant.


INTRODUCTION
Studies of a quantitative nature, in the broad area of Human Movement Sciences, have the most diverse objectives. While some aim to evaluate the effects of different physical training methods for rehabilitation or performance improvement purposes (thus comparing them), others aim to identify associations between certain characteristics of interest. There are also studies that seek to verify the prevalence and incidence of certain motor patterns, and studies that aim to make predictions from available variables. In this context, the correct use of methods for the selection of research subjects, known as "sampling", as well as the precise determination of the number of individuals to be recruited, is of fundamental importance in order to obtain adequate results [1][2][3][4][5][6][7][8].
In quantitative surveys, which aim to propose generalizations from a set of data (inferences), the study subjects are generally called the "sample", which represents the case or unit element of the research. When treated collectively, the total set of subjects is designated the "population", which represents the cluster of all elements that have at least one characteristic in common [1][2][3]. In this sense, sampling techniques are necessary because, in almost all studies, it is not possible or convenient to access the entire population. Information about a part of this population is therefore acquired in order to infer attributes of the whole [4].
The choice of sampling procedures should be guided by the study objectives and the characteristics of the methods to be adopted. Traditionally, sampling methods are classified as probabilistic (subjects are randomly selected, and all have the same probability of being selected) and non-probabilistic (the selection of the study subjects does not occur with equiprobability, that is, the members of the population do not have an equal chance of being selected) [1,5]. From the statistical point of view, it seems reasonable that probabilistic samples are more adequate; however, in the practice of research in Human Movement Sciences, procedures for sample randomization are not always performed [3,4].
In addition to the method for selecting the subjects that will compose the sample, an adequate sample size is also important to achieve acceptable accuracy of the measure and, consequently, of the final result [6]. The determination of the sample size is the first practical procedure in the development of an experiment to answer the research question [2]. However, it is a complex process, influenced by many factors, such as knowledge about the sampling process, the study design and the statistical tests [1].
The definition of the adequate number of subjects to be included in quantitative studies is extremely important, considering statistical and ethical aspects. Undersized or oversized samples may lead to misleading conclusions and inappropriate ethical behavior [6]. In this context, Brito et al. [7] suggest that it is of fundamental importance that researchers justify the number of subjects to be included in studies to Research Ethics Committees. According to Winter et al. [8], calculations for determining sample size aim to select the number of participants in an experiment as economically as circumstances allow.
Several calculation models have been proposed to determine the sample size of studies. Among the different possibilities, some authors suggest equations that take as their main factor the statistical power chosen in advance, while others take the confidence interval of the measure to be evaluated, obtained from previous studies. The choice of the most appropriate way to estimate the ideal sample size should be based on the study design. This is a growing area undergoing technological evolution [7], which has facilitated the practical application of knowledge and allowed more precise estimates of the ideal number of subjects to be included in studies in the area of Human Movement Sciences. Therefore, a need was identified for a survey to establish, in depth, which procedures are most widely used, in order to enable the improvement of such methods.
Thus, the aim of the present study was to identify the sample selection methods, as well as the use and types of sample size calculation, adopted in doctoral theses and Master's dissertations developed in a graduate program in the broad area of Physical Education.

METHODS
Study Design
This work is characterized as a descriptive study.

Selection of Theses and Dissertations
Theses and dissertations used in the present study come from the Graduate Program in Human Movement Sciences (PPGCMH) of the Federal University of Rio Grande do Sul (UFRGS) and were obtained through a digital repository; being of public access, they did not require evaluation by the Research Ethics Committee. Master's and doctoral monographic studies presented between January 2003 and December 2013 were included, considering that this study is part of an approved project with collections for this period. These theses and dissertations were first cataloged and inserted into the EndNote® reference manager software.
As inclusion criteria, only quantitative studies defended before a qualifying examination board during the aforementioned period were included (in the case of this study, those that determined a formal, objective and systematic method for the generation of numerical data that were used to establish relationships between variables, adopting statistical methods already standardized in the scientific literature). Mixed studies, with quantitative and qualitative approaches simultaneously, were also included. On the other hand, studies of a purely qualitative nature, studies validating research instruments, and those unavailable in the digital repository were excluded.

Data Collection and Analysis
Data were collected independently by two researchers experienced in this type of collection, using a standardized form that was the same for theses and dissertations. The researchers involved in data extraction were a post-doctorate student and a post-doctorate researcher, with prior knowledge of biostatistics. The form consisted of the following questions with the respective answer possibilities: (1) criterion for sample selection ([a] non-probabilistic, intentional, for convenience; [b] probabilistic, random, casual); (2) presence of calculations to determine the sample size ([a] no; [b] yes); (3) type of calculation to estimate sample size ([a] not applicable; [b] determined by predictive equations; [c] determined using statistical power; [d] considering the type of statistical test to be adopted a posteriori in the analysis of dependent variables; [e] considering the confidence interval of the dependent variables). With regard to the criterion "determined using statistical power", this group comprised all studies that reported their estimates together with the value chosen for statistical power, using parameters related to the dependent variables obtained from previous studies published in the scientific literature, and that did not fit the other categories.
After individual collection was completed, the agreement between the data extracted by the two researchers was tested. When a disagreement was identified, it was resolved by consensus, with the analysis of a third researcher, a university professor with a doctorate degree. These data were tabulated in digital spreadsheets in Microsoft Excel® 2011 software, version 14.7.0.
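The text does not state which agreement statistic was applied. As one common option for two raters assigning categorical codes, Cohen's kappa could be computed as in the minimal sketch below. The choice of kappa, the function name `cohen_kappa` and the example codes are assumptions made for illustration and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who coded the same items (illustrative)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same code
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes for six studies: "np" = non-probabilistic, "p" = probabilistic
codes_a = ["np", "np", "p", "np", "p", "np"]
codes_b = ["np", "np", "p", "p", "p", "np"]
kappa = cohen_kappa(codes_a, codes_b)
print(f"kappa = {kappa:.2f}")
```

In this hypothetical case the raters disagree on one of six studies; items with disagreement would then go to the third researcher, as described above.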

Statistical analysis
The characteristics of the studies are presented through descriptive statistics, with values described as absolute and relative frequencies (in relation to the total of quantitative studies included). These analyses were performed in the statistical package SPSS® version 22.0.

RESULTS
A total of 327 studies were identified in the digital repository from January 2003 to December 2013. Of these, 199 were used in the final analysis of the present study. As general characteristics of these studies, 134 have a cross-sectional design (67%), while 65 are longitudinal (33%). Figure 1 illustrates the characteristics of the identified studies.
The analysis of the method used to select the sample showed that, of the 199 studies included, only 11 selected their samples in a probabilistic way (6% of the total), while six used animal models (3%), to which this classification is not applicable. Among the 43 theses of a quantitative nature, only two (5%) adopted probabilistic methods to select their samples and another three were performed with animal models (7%). Similarly, of the 156 quantitative dissertations presented in the analyzed period, only nine selected their samples in a probabilistic way (6%) and another three (2%) adopted animal models.
Among the 11 monographic works analyzed that adopted probabilistic sampling techniques, five were composed of athletes of regional or national level, which made it possible to obtain the listing of this population from the respective confederations. Four other studies adopted probabilistic samples selected within population conglomerates. Finally, two other studies adopted a simple draw from the list of subjects in databases to compose their samples in a casual way. These characteristics made the random composition of the sample possible in these studies.
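The simple draw from a complete list described above can be sketched as follows. This is only an illustration: the athlete identifiers, the list size, the sample size and the seed are hypothetical, and the source does not describe the tool used for the draws in the analyzed studies.

```python
import random


def simple_random_sample(sampling_frame, n, seed=None):
    """Draw n distinct units from a complete list, each with equal chance."""
    if n > len(sampling_frame):
        raise ValueError("sample size cannot exceed the size of the list")
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(sampling_frame, n)


# Hypothetical listing of 200 registered athletes, each identified by a code
frame = [f"athlete_{i:03d}" for i in range(1, 201)]
selected = simple_random_sample(frame, n=30, seed=2013)
print(len(selected), "units drawn without replacement")
```

The sketch makes the practical requirement explicit: the draw is only possible when every unit of the population can be listed and identified beforehand.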
Regarding calculations to determine the sample size, 72 studies (36%) reported having performed calculations for this purpose. Of these, 15 were theses (35% of the theses) and 57 were dissertations (37% of the dissertations) that reported such a procedure to establish the number of subjects needed for their studies.
The 72 monographic studies that reported having performed calculations to determine the sample size were stratified, and a descriptive analysis was performed regarding the method adopted to establish this number. This analysis showed that 18 studies (25%) determined their ideal sample sizes using predictive equations, and another 48 studies (67%) adopted methods based on statistical power, with parameters obtained from previous studies. One study (1%) reported having adopted procedures that considered the type of statistical test to be used a posteriori, two other studies (3%) reported procedures that adopted the confidence interval as an instrument for determining sample size and, finally, three studies (4%) did not mention the method chosen to calculate the sample size.
Among the 15 theses that performed this procedure, four determined sample size using predictive equations (27%) and nine others used methods based on statistical power, with parameters found in previous studies (60%). One thesis reported having performed procedures that adopt the confidence interval as an instrument for determining sample size (7%) and, finally, one thesis did not mention the method used to establish the sample size (7%).
Of the 57 dissertations reporting calculations for sample determination, 14 determined their ideal sample sizes using predictive equations (25%) and 39 used methods based on statistical power, with parameters found in previous studies (68%). One dissertation reported having adopted procedures that consider the type of statistical test to be used a posteriori (2%), another mentioned having performed procedures that adopt the confidence interval as an instrument for determining the sample size (2%) and, finally, two dissertations did not mention the method used to establish the sample size (4%).
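The relative frequencies reported for the 72 studies with sample calculations can be reproduced from the absolute counts given in the text. The sketch below is only a check of the arithmetic; the dictionary labels and the function name `frequency_table` are illustrative choices, and rounding to whole percentages is assumed.

```python
def frequency_table(counts):
    """Return {category: (absolute, relative %)} with whole-number percentages."""
    total = sum(counts.values())
    return {name: (n, round(100 * n / total)) for name, n in counts.items()}


# Absolute counts reported for the 72 studies that performed sample calculations
methods = {
    "predictive equations": 18,
    "statistical power": 48,
    "confidence interval": 2,
    "type of statistical test": 1,
    "method not mentioned": 3,
}

table = frequency_table(methods)
for name, (absolute, percent) in table.items():
    print(f"{name}: n={absolute} ({percent}%)")
```

Applying the same function to the counts for theses (4, 9, 1, 0, 1) or dissertations (14, 39, 1, 1, 2) gives the stratified percentages.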
Among the 14 studies that adopted predictive equations to determine the ideal sample size, Equation 1 was the most frequent, being used in five studies (36%). Equation 2 was used in three other studies (21%). Equation 3 was used in two studies (14%). Equations 4, 5, 6 and 7 were adopted in a single study each (totaling 29%).

DISCUSSION
This study aimed to identify the sample selection methods, as well as the use and types of calculation to determine the sample size, adopted in theses and dissertations developed in a PPG (Graduate Program) in the area of Physical Education. In this sense, it was verified that most studies (91%) adopted a non-probabilistic method to select the participants of their experiments. In addition, this predominance occurred regardless of the type of monographic work analyzed, being similar in theses and dissertations. According to Vieira [3], the definition of the sample components of a study requires the establishment of rigid criteria to select the units that will compose the sample. However, non-probabilistic or convenience samples do not invalidate the research, provided that they are very well described; they represent only the population of individuals similar to those included in the sample. Therefore, this type of sample must be obtained according to some criterion, since not all elements of the population have the same chance of being selected. This requires caution in the generalization of the results obtained. Given its limitations, this type of sample may be appropriate when the research target population is composed of specific groups (e.g., patients or high-level athletes) or when the study budget is limited [3]. Such conditions were widely observed in the monographic studies analyzed in the present study and, therefore, may justify the widespread adoption of non-probabilistic methods for the selection of research subjects.
On the other hand, few studies (6%) adopted probabilistic techniques to select their sample units, a behavior observed in both theses and dissertations. Although probabilistic samples are statistically preferred, in practice they are not always feasible. A possible explanation for the small number of works found is that the researcher must have access to the list of all subjects of the population of interest, so that the units that will compose the study can be drawn from it. It is therefore necessary to know the population, and each unit should be identified by name, number or code [3]. Such a requirement makes it impossible to select random samples in most studies with interventions in humans.
Regarding the sample size, a little more than one-third (36%) of studies reported using methods to estimate the ideal number of participants. Considering the importance of adopting appropriate methods for such determination, this can be considered a small number, and it is repeated when theses and dissertations are analyzed separately. The appropriate specification of the sample size confers internal validity on the study [7], since the "quality" and the accuracy of the estimate are strongly dependent on the sample size [3,6].
According to Gaya [1], the choice of the method for sample size design depends on the knowledge about the process of sample constitution (randomness), the type of experimental design adopted, the expected statistical analysis, and the knowledge of the effect investigated. A commonly observed problem is that the tradition followed in the research area has, over the years, been used as one of the factors that influence the sample size. When the sample size is not well defined on the basis of the theoretical principles mentioned above, the choice may be biased. All these factors demonstrate the complexity of the process of determining the correct sample size and possibly contribute to the high number of studies (64%) that do not adopt sample size calculations.
It is important to emphasize that the use of an inadequate number of subjects may lead to mistaken study conclusions. Results from samples smaller than necessary may lead to the acceptance of the null hypothesis when it is in fact false (type II error). An excessive sample size may represent a waste of financial and human resources in data collection and imply inadequate ethical conduct, since more subjects are exposed to experimental procedures than would be necessary. Moreover, a sample size larger than adequate may increase the probability of finding statistically significant differences even if such results have no biological or clinical relevance [2,6].
An important point to consider in terms of sample size in quantitative studies is the study design. In descriptive studies, sample size is mainly influenced by the variation of the phenomenon investigated in the population and by the accuracy of the estimate that one wants to obtain. In this sense, the larger the population variation, the larger the sample size required. Similarly, the greater the accuracy required, the greater the number of participants in the sample. In experimental studies, one of the main determinants of sample size is the planning model chosen. In studies that evaluate the variation of a certain variable within subjects, the required sample will be smaller than that required for between-subject studies, considering that the first group of experiments has greater internal consistency in results when compared to the others [1].
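The dependence of sample size on population variation and on the desired accuracy in descriptive studies can be illustrated with the usual normal-approximation formula for estimating a mean, n = (z · σ / E)², where σ is the standard deviation and E the acceptable margin of error. The sketch below is an illustration under these assumptions; the function name and the numerical values are hypothetical and do not come from the analyzed studies.

```python
import math
from statistics import NormalDist


def n_for_mean_precision(sd, margin, confidence=0.95):
    """Sample size so that the half-width of the CI for a mean is at most `margin`.

    Normal approximation: n = (z * sd / margin) ** 2, rounded up.
    """
    if sd <= 0 or margin <= 0 or not 0 < confidence < 1:
        raise ValueError("sd and margin must be positive; confidence in (0, 1)")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    return math.ceil((z * sd / margin) ** 2)


# Hypothetical values in arbitrary units of the dependent variable
n_base = n_for_mean_precision(sd=10, margin=2)
n_more_variable = n_for_mean_precision(sd=15, margin=2)  # larger variation
n_more_precise = n_for_mean_precision(sd=10, margin=1)   # tighter margin
print(n_base, n_more_variable, n_more_precise)
```

Under these assumptions, both a larger standard deviation and a smaller margin of error increase the required number of participants, in line with the reasoning above.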
Regarding the method chosen to determine the sample "n", the analysis showed that approximately two-thirds of the studies (67%) that had adopted some criterion for this purpose did so by determining a statistical power value as the primary criterion, and considered parameters related to dependent variables obtained from previous studies. By convention in biostatistics, and especially in the area of Physical Education, the maximum acceptable value for type II error (the probability of failing to find differences when they actually exist, represented by "β") is set to 0.20, i.e., a 20% chance that the null hypothesis is wrongly accepted is assumed. In this context, the statistical power of the study (1 - β) is 80%, which means accepting that real differences between means will be missed in one of every five comparisons performed [2,5]. Handling the different values of statistical power entered in the sample calculation is one of the most important procedures in research practice and is entirely the responsibility of the researcher. The choice of methods that adopt statistical power as the primary criterion is probably the most frequent because they allow different values to be tested in order to obtain sample sizes that are feasible given the reality of the experiment. However, the importance of ethical conduct in the choice of the power value to be adopted in the calculation is emphasized, considering that the chance of committing a type II error decreases as the study power increases.
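To make the role of statistical power concrete, the sketch below uses the common normal-approximation formula for comparing the means of two independent groups, n per group = 2 (z₁₋α/₂ + z₁₋β)² / d², where d is the standardized effect size taken from previous studies. This particular formula, the two-group design, the function name and the effect size of 0.5 are assumptions for illustration; the analyzed theses and dissertations may have used other models or software.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Participants per group for a two-sided comparison of two independent means.

    Normal approximation: n = 2 * (z_{1-alpha/2} + z_{power}) ** 2 / d ** 2.
    """
    if effect_size <= 0 or not 0 < alpha < 1 or not 0 < power < 1:
        raise ValueError("effect_size must be positive; alpha and power in (0, 1)")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for type I error
    z_beta = NormalDist().inv_cdf(power)           # quantile for power = 1 - beta
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


# Hypothetical standardized effect size of 0.5 taken from "previous studies"
n_80 = n_per_group(0.5, power=0.80)  # conventional beta = 0.20
n_90 = n_per_group(0.5, power=0.90)  # stricter beta = 0.10
print(n_80, n_90)
```

Raising the power from 80% to 90% increases the required sample, which illustrates why the chosen power value directly affects the feasibility of a study. Calculations based on the t distribution give slightly larger values than this approximation.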
The use of predictive equations was the second most widely used method to estimate the ideal sample size (25%). This method is a practical technique; however, the choice of the most appropriate equation for the type of study should be made with caution. Some of these equations require the size of the population of interest, which may hinder their application, since the population "N" is not always easy to obtain. The most common use, therefore, is that of predictive equations based on the variability of the dependent variable of interest.
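As an illustration of a predictive equation that requires the population size "N", the sketch below uses the widely known formula for estimating a proportion, n₀ = z² p (1 - p) / e², with the finite population correction n = n₀ / (1 + (n₀ - 1) / N). This is not necessarily one of Equations 1 to 7 used in the analyzed studies, which are not reproduced in the text; the function name and the values of p, e and N are hypothetical.

```python
import math
from statistics import NormalDist


def n_for_proportion(margin, p=0.5, confidence=0.95, population=None):
    """Sample size to estimate a proportion within +/- `margin`.

    If `population` (N) is given, the finite population correction is applied.
    """
    if not 0 < margin < 1 or not 0 < p < 1 or not 0 < confidence < 1:
        raise ValueError("margin, p and confidence must lie in (0, 1)")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * p * (1 - p) / margin ** 2  # size for a very large population
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)  # finite population correction
    return math.ceil(n0)


n_unknown_population = n_for_proportion(0.05)                 # N not available
n_known_population = n_for_proportion(0.05, population=1000)  # hypothetical N
print(n_unknown_population, n_known_population)
```

When N is known and relatively small, the correction reduces the required sample; when N cannot be obtained, only the uncorrected form, based on variability and precision, is applicable.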
The use of the confidence interval as an instrument to determine the sample size, as well as procedures that consider the type of statistical test to be used a posteriori, also seem to be unusual in the monographic works presented to the PPG, being used in only two (3%) and one (1%) of the studies that reported sample calculations, respectively. A possible limitation of the use of the confidence interval as a parameter for the sample size calculation is that few studies published in the area of Physical Education present this parameter, making it impossible to obtain data on several variables of interest. This fact may explain the low percentage of choice of this method.
In many cases, the non-use of methods that incorporate the type of test to be used a posteriori is explained by the difficulty of identifying, in the initial phase of the study (as a research project), the statistical model to be adopted. In fact, according to Greenland et al. [9], the choice of the appropriate statistical model requires knowledge of several assumptions, such as the study design and how data collection and analysis were conducted, so that this information is incorporated into the model that supports the method. This demonstrates the difficulty of a correct choice a priori, since several of these assumptions are unknown or uncertain in the phase in which the number of subjects to be included in the study is determined.
In general, although the PPG from which the analyzed studies originate has good results with an official evaluation body and more than 30 years of uninterrupted activity, its initial monographic works (dissertations and theses) have, in the majority of cases, important methodological limitations, such as those presented in the results of this study. This suggests that the evaluation that leads to good ratings with funding institutions possibly considers not the quality of a PPG's studies but rather the number of studies carried out, thus reflecting a predominantly quantitative analysis. Another important point is that this evaluation is focused on the final product, that is, on published articles, and not on the process (dissertation or thesis).
The limitations found in the monographic works analyzed also suggest deficiencies in the education of graduate students in human movement sciences regarding the principles of research methodology and biostatistics. Thus, the insufficient preparation of students on the importance of the correct use of sampling techniques and of the main calculation methods to determine the sample size is emphasized, as well as their insufficient instrumentalization to apply them. It is believed that there is no short-term solution to this problem, since it involves a continuous process of teaching and learning of graduate students. However, in the medium and long term, it is possible to suggest some improvement strategies, such as: (i) better familiarization of graduate students with sampling methods, both in theory, through studies that provide the theoretical basis, and in practice, through real or simulated participation in research; (ii) the insertion of classes demonstrating the importance of a priori sample size estimation using simulations and parameters such as power, effect size and significance level; and, finally, (iii) practical instrumentalization in the various methods of determining the ideal sample size (predictive equations and software). In addition, the important role of qualifying examination boards in the construction of projects that are better qualified from the statistical point of view is also emphasized.
However, some limitations may be attributed to the present study: (i) the analysis of monographic studies from only one PPG; (ii) the inclusion of studies defended over a limited period only (2003 to 2013); (iii) the lack of analysis of studies that are not available in the digital repository; and (iv) the analysis based exclusively on information presented in public documents (theses and dissertations available in the repository), with no contact with researchers to acquire additional information. Thus, for the advancement of this expanding area of knowledge, new studies are suggested that map the methodological procedures regarding the sampling and statistical processes adopted, in order to identify weaknesses and suggest methods for improvement.
In this sense, the present study sought to contribute to the growth of science in this specific area of knowledge, demonstrating the need for greater attention to teaching and instrumentalization regarding methods related to sampling processes and sample size determination in graduate programs. In addition, this study sought to alert researchers to the need for more careful planning of quantitative data analysis. To our knowledge, this study is unique in its objective, presenting unpublished data regarding the sampling processes and the use of sample calculations in a PPG in the area of human movement sciences.

CONCLUSION
The findings of the present study allow us to conclude that the monographic works presented to the PPGCMH predominantly adopt non-probabilistic sampling methods for the selection of the subjects that compose their samples. Moreover, the majority of studies do not report adopting calculations to estimate the ideal sample size for their experiments. Finally, among those that use calculations to determine the number of subjects to be included, models that consider statistical power as the main criterion are predominant.

Figure 1. Quantitative flowchart, representative of studies identified, excluded and included in the analysis, including the reasons for exclusion, as well as the stratification level (thesis or dissertation) of included studies. Porto Alegre, November 29, 2017.