Regression to the mean (or at the end, people are not as smart as you could expect)

Francis Galton cleverly coined the term "regression to (or towards) the mean": if a variable is extreme on a first measurement, subsequent observations of that same variable will tend to be closer to the mean of its distribution. The classic example is height: a tall child will have … Continue reading Regression to the mean (or at the end, people are not as smart as you could expect)
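The tendency described in this excerpt can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the post: each unit has a fixed "true" value, and every measurement adds independent Gaussian noise of the same size. The function name `regression_to_mean_demo` and all parameters are hypothetical.

```python
import random
import statistics


def regression_to_mean_demo(n=10000, top_fraction=0.1, seed=0):
    """Compare first and second measurements for units that were extreme at first.

    Assumed model (illustrative only): measurement = true value + noise,
    with true values and noise both drawn from a standard normal.
    """
    rng = random.Random(seed)
    true_values = [rng.gauss(0.0, 1.0) for _ in range(n)]
    first = [t + rng.gauss(0.0, 1.0) for t in true_values]
    second = [t + rng.gauss(0.0, 1.0) for t in true_values]

    # Select the units with the highest first measurement.
    k = max(1, int(n * top_fraction))
    top = sorted(range(n), key=lambda i: first[i], reverse=True)[:k]

    mean_first = statistics.fmean([first[i] for i in top])
    mean_second = statistics.fmean([second[i] for i in top])
    return mean_first, mean_second


mean_first, mean_second = regression_to_mean_demo()
print("mean of first measurement (top group): ", round(mean_first, 3))
print("mean of second measurement (same group):", round(mean_second, 3))
```

Under this assumed model, the group selected for its extreme first measurement should, on average, score closer to the overall mean (zero here) on the second measurement, although it remains above it.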

Data Literacy

Once again I have decided to pimp my blog. This time the change is significant: the name. This blog began a long time ago. It was 2006; I was 22 years old and enrolled in a Master of Science in Statistics. I had plenty of doubts about statistics, data science, and uncertainty (fortunately, some of … Continue reading Data Literacy

My talk in Bogotá – Pvalues: use and abuse

Did you know that the American Statistical Association (ASA) issued a statement on the proper use of p-values? Moreover, the ASA (one of the oldest professional associations in the USA) stated that: P-values can indicate how incompatible the data are with a specified statistical model. P-values do not measure the probability that the studied hypothesis is true, or the … Continue reading My talk in Bogotá – Pvalues: use and abuse

Isolating confounding effects – What does it mean?

Have you heard (or read) those geeks (econometricians and statisticians) talking about controlling for some variable in a study? The origin of this practice lies in the design of experiments: when you plan an experimental study, you try to randomise units according to categories that you know are important. For example, let’s assume that … Continue reading Isolating confounding effects – What does it mean?

Not a good advice (On regression over principal axis)

Some time ago, someone told me about a robust methodology used to select the significant variables on a factor. Long story short: he was performing a factor analysis and wanted to know which variables were more important than others on that factor. The good advice was: define a model between … Continue reading Not a good advice (On regression over principal axis)

Lord's Paradox

In a paper titled A Paradox in the Interpretation of Group Comparisons, published in Psychological Bulletin, Lord (1967) made the following controversial story famous: A university is interested in investigating the effects of the nutritional diet that its students consume in the campus dining hall. Various kinds of data were collected, including the weight of each … Continue reading Lord's Paradox

Applied statistics in the evaluation of education – Call for papers

The journal Comunicaciones en Estadística has, in a very short time, achieved significant standing at the local level. As proof of this, the journal was accepted into the Current Index to Statistics, and Colciencias decided to classify it in category B in Publindex. This is a great achievement for its editorial board, headed by … Continue reading Applied statistics in the evaluation of education – Call for papers

Anchoring estimation or the perfect excuse to become "Bayesian"

Anchoring is a common procedure when estimating abilities in test equating. It is about analyzing standardized tests while maintaining a predefined scale. For example, assume that you have a set of 60 items in your test. However, two test forms (named Form A and Form B) are given to the students at two different times. … Continue reading Anchoring estimation or the perfect excuse to become "Bayesian"

Parametric bootstrap

Assume we want to know the mean squared error (MSE) of the sample median as an estimator of a population mean under normality. As you know, this is not a trivial problem. We may take advantage of the bootstrap method and solve it by means of simulation. This way, for $b=1,\ldots, B$, we generate $X_{b1},\ldots, X_{bn} \sim … Continue reading Parametric bootstrap
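The simulation outlined in this excerpt might look roughly like the sketch below. The excerpt is cut off before it names the distribution used to generate each bootstrap sample, so the choice here is an assumption: a normal distribution fitted to the observed data by its sample mean and sample standard deviation. The function name `parametric_bootstrap_mse_median` and the default `B` are also hypothetical.

```python
import random
import statistics


def parametric_bootstrap_mse_median(x, B=2000, seed=0):
    """Parametric bootstrap estimate of the MSE of the sample median.

    Assumption (not stated in the excerpt): bootstrap samples come from
    N(mu_hat, sigma_hat^2), where mu_hat and sigma_hat are estimated from x.
    """
    rng = random.Random(seed)
    n = len(x)
    mu_hat = statistics.fmean(x)      # plug-in estimate of the population mean
    sigma_hat = statistics.stdev(x)   # sample standard deviation

    squared_errors = []
    for _ in range(B):
        # Draw one bootstrap sample X_b1, ..., X_bn from the fitted model.
        resample = [rng.gauss(mu_hat, sigma_hat) for _ in range(n)]
        # Squared error of the median against the mean of the fitted model.
        squared_errors.append((statistics.median(resample) - mu_hat) ** 2)

    return statistics.fmean(squared_errors)


# Usage with synthetic data (illustrative values only).
data_rng = random.Random(42)
sample = [data_rng.gauss(10.0, 2.0) for _ in range(50)]
print(parametric_bootstrap_mse_median(sample))
```

Fixing the seed makes the estimate reproducible; a larger `B` reduces the Monte Carlo error of the estimate at the cost of more computation.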

The advantage – Polytomous models in IRT

On Friday afternoon, in the middle of a seminar, I asked the following: What are the advantages of polytomous IRT models over simpler techniques such as categorical principal components? The thing is that, when it comes to building indices, analysts of social surveys tend to choose multivariate analysis techniques for the construction … Continue reading The advantage – Polytomous models in IRT

GitHub project – Estrategias de muestreo: diseño de encuestas y estimación de parámetros

Six years ago my first book was born: Estrategias de muestreo (EM). The book covers the most important topics in sampling and survey-based inference, and it uses the R software. In recent years, EM has been used by professors and consultants across Colombia and Latin America. The book's success has meant … Continue reading GitHub project – Estrategias de muestreo: diseño de encuestas y estimación de parámetros

Writing Books with R and knitr

I am writing this post on behalf of people who, like me, could not find anything useful on the web when trying to compile big documents in R with the knitr library. knitr is a basic tool for the statistician. It combines LaTeX and R in a single environment. It is useful for creating elegant reports, … Continue reading Writing Books with R and knitr