The Colombian Statistical Symposium is being held this August in Sincelejo, where I will be presenting a talk about sample size and population variances. The discussion rests on a recent paper we wrote for Revista Comunicaciones en Estadística. Besides approaching the theory, we also wrote some R functions to compute proper sample sizes in various … Continue reading My talk in Sincelejo – Sample size for estimating population variances
Voting intention and calibration estimators – My article in CJS
During the last few years, I've been very interested in electoral studies. If you have been a reader of this blog, you may recall that I predicted, some years ago, that Santos was going to win the presidential elections in Colombia. From that very election (Zuluaga won in the first round, while Santos won … Continue reading Voting intention and calibration estimators – My article in CJS
Isolating confounding effects – Rankings and residuals
In a previous entry, we talked about the meaning and importance of isolating confounding variables. This entry is dedicated to the residuals and their relation to the variable of interest when controlling for some confounding factors. Let's think about education, which always makes a good illustration of this issue. Assume that the performance of … Continue reading Isolating confounding effects – Rankings and residuals
Isolating confounding effects – What does it mean?
Have you heard (or read) those geeks (econometricians and statisticians) talking about controlling for some variable in a study? The origin of this practice lies in the design of experiments: when you plan an experimental study, you try to randomise units according to categories that you know are important. For example, let’s assume that … Continue reading Isolating confounding effects – What does it mean?
Not a good advice (On regression over principal axis)
Some time ago, someone told me about a robust methodology that was used to select the significant variables on a factor. Long story short: he was performing some factor analysis and he wanted to know which variables were more important than others on that very factor. The good advice was: define a model between … Continue reading Not a good advice (On regression over principal axis)
I don’t care about that lost unit
Just assume that you have planned a survey along with the necessary sample size to obtain representativity. Let’s suppose the sample size is 100. However, as nonresponse is always present, unfortunately your effective sample size is 99. Consider the following figure. It shows two scatterplots; the one on the right (expected) has one more point that … Continue reading I don’t care about that lost unit
IRT classic anchoring with R functions
The main goal of standardised tests is to produce scores that can be compared not only within subgroups of students (and subpopulations of interest) but also across applications (at different times). In summary, researchers and methodologists must ensure that all of the scores induced by the test are on the same scale in order to allow … Continue reading IRT classic anchoring with R functions
IRT equating using R functions – The calibrated pool method
In the assessment of education it is very common to use Item Response Theory to produce measures of ability for the students who took a standardised test. Moreover, if you want to gain comparability between applications, you should know that it is not enough to use IRT models but you have to do … Continue reading IRT equating using R functions – The calibrated pool method
Estimating the change of two measures by equating
Suppose that we have a survey (baseline + follow-up) in which we ask some people about a variable of interest. If we follow that cohort, or even if we later ask some other people, we can estimate the net change of that very variable in the scale of the baseline by means … Continue reading Estimating the change of two measures by equating
Implicit stratification using systematic sampling
One of the reasons systematic sampling is used in the early stages of a sampling design is its ease of implementation. Moreover, if the sampling frame contains categorical auxiliary information (or continuous information that can be categorised), the frame can be sorted according to these variables. Taking into … Continue reading Implicit stratification using systematic sampling
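To make the idea concrete, here is a minimal sketch in Python of sorting a frame by an auxiliary variable and then drawing a systematic sample. It is only an illustration under stated assumptions, not the code from the entry: the function name `systematic_sample`, the toy frame and its `region` variable are all hypothetical, and the sampling interval is simplified to the integer part of N / n, so any leftover units at the end of the frame are never selected.

```python
import random


def systematic_sample(frame, n, sort_key, seed=None):
    """Sort the frame by an auxiliary variable, then take every k-th unit.

    Hypothetical helper for illustration only. The interval k is the
    integer part of N / n, a simplification of the general case.
    """
    ordered = sorted(frame, key=sort_key)   # sorting yields the implicit strata
    k = len(ordered) // n                   # sampling interval
    if k < 1:
        raise ValueError("sample size n cannot exceed the frame size")
    start = random.Random(seed).randrange(k)  # random start in 0..k-1
    return [ordered[start + i * k] for i in range(n)]


# Toy frame (assumed data): 12 units, 4 in each of three regions.
frame = [{"id": i, "region": "ABC"[i % 3]} for i in range(12)]

sample = systematic_sample(frame, n=3, sort_key=lambda u: u["region"], seed=1)
regions = [u["region"] for u in sample]
print(regions)
```

Because the sorted frame places the four units of each region next to each other and the interval is also four, every possible random start picks exactly one unit per region. No strata were ever declared, which is why the stratification is called implicit.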
Simulating Lord's paradox in R
It is hard not to mention Lord's paradox in a course on statistical methods or stochastic modelling. Moreover, if you use the most important statistical software in the world, R, then this entry is of interest to you. Lord's paradox summarises the analysis of two statisticians who analyse the average weight of some students in … Continue reading Simulating Lord's paradox in R
Lord's paradox
In an article called A Paradox in the Interpretation of Group Comparisons, published in Psychological Bulletin, Lord (1967) made the following controversial story famous: A university is interested in investigating the effects of the nutritional diet that its students consume in the campus restaurant. Several kinds of data were collected, including the weight of each … Continue reading Lord's paradox