Archive for November 2008
Naturalisation as a Mexican citizen
On Friday I finally passed my naturalisation exam to become a Mexican citizen. I am very proud of the achievement. I have wanted to take the country’s nationality since 2005, when I applied. Since then, the Secretaría de Relaciones Exteriores has lost my file on multiple occasions, and I am still waiting for the letter that formalises my status.
Please vote! It takes no more than a second.
Another surprising detail was the 101 questions I had to memorise to be sure of passing the exam. The study guide changed after I was invited to take the exam, so I had to do it twice. The second guide was considerably less logical than the first.
Here is the study guide: guia_estudio
The procedure is very arbitrary. The candidate for naturalisation receives five random questions from this esoteric list. One wrong answer is allowed. The questions obviously have little relevance as a way of evaluating the candidate’s integration into Mexican society. Some are ambiguous and badly worded. For example, “Que era la cuenca larga” (“What was the long basin”). I suppose it was meant to say cuenta (“count”).
If you have other suggestions, leave them as comments below.
Here are some more examples
Kolmogorov-Smirnov tests of normality
Click here for an explanation of the animation below
Today I had an interesting exchange regarding tests of normality when teaching introductory statistics.
I have a dilemma. I do not place a great deal of stress on tests of normality when I teach Master’s students, although I mention that they are often used. However, introductory statistics texts give tests of normality a lot more attention, presumably to ensure that students are aware that normality is an important assumption for many statistical procedures. I’m all in favour of testing assumptions. But do students really know what assumptions they are testing?
I have to teach introductory statistics without confusing students or sending mixed messages. It is therefore quite a delicate matter that needs clarity.
In fact none of these statements are accurate (and it’s Smirnoff you are thinking of!). My own preference is to try to teach students to understand why any underlying population or sampling frame might not be normal. They should also intuitively understand how the procedure used for sampling from the population may influence the properties of the sample drawn from it.
These properties are then treated as expected before beginning any field work. All data transformation or use of non-parametric tests are pre-planned as part of the formal protocol designed for data collection and analysis.
I really do not like any post-hoc alterations to a planned work scheme after the data are collected. At best they waste time, at worst they lead students to think that the data themselves are somehow “invalid” and thus unpublishable.
I therefore quite strongly dislike including post hoc tests of normality within the workflow of the analysis as a knee-jerk procedure with a yes/no answer. This certainly does not mean that I tell students to assume that all the preplanned analyses are necessarily valid, or to accept that inference on the mean can be conducted without checking assumptions.
The alternative to automated tests of normality is to make sure that students always visualise the distribution of their data fully in order to understand why any assumption of normality may be wrong. I also try to encourage students to understand how and why data transformations might work. Again this is usually most helpful before data are collected, but it is also a way to deal with major surprises.
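As a minimal sketch of what visualising the distribution can look like in R, the code below plots a histogram and a normal quantile plot before and after a log transformation. The data are simulated from a lognormal distribution purely for illustration, and the helper `skewness()` is a hypothetical convenience function defined here, not part of base R.

```r
set.seed(1)

# Hypothetical right-skewed measurements, simulated for illustration only
x <- rlnorm(200, meanlog = 2, sdlog = 1)
log_x <- log(x)

# Simple moment-based skewness; close to 0 for a symmetric distribution
skewness <- function(v) mean((v - mean(v))^3) / sd(v)^3

# Look at the data before and after a log transformation
op <- par(mfrow = c(2, 2))
hist(x, main = "Raw data")
qqnorm(x, main = "Raw data"); qqline(x)
hist(log_x, main = "Log-transformed")
qqnorm(log_x, main = "Log-transformed"); qqline(log_x)
par(op)

cat("Skewness raw:", skewness(x), " log:", skewness(log_x), "\n")
```

The plots show the shape of the departure from normality (here a long right tail), which a single p-value does not.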
Here again is the link to the pdf document I wrote that suggests a possible answer to the poll.
Click on the link above as it is easier to include PDFs in WordPress this way.
And here is a quick test of any interpretation of the results of a KS test of normality.
Just to summarise the well-known reason to avoid testing for normality: if you draw a very large sample from a slightly non-normal population, the test tends to produce low p-values. You should then presumably reject the null hypothesis that the data could have come from a normal population and, according to a strict interpretation, you can’t use your planned analysis as it would be “invalid”!
However, if you draw small samples from very non-normal populations (as shown in the pdf), you will not reject the null hypothesis as often, even though the methodology will provide misleading inference. (Animation: ksdemo3)
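The contrast can be checked with a small simulation. This is only a sketch under assumptions of my own choosing: a t distribution with 5 degrees of freedom stands in for the "slightly non-normal" population and an exponential distribution for the "very non-normal" one, and `reject_rate()` is a hypothetical helper. The mean and standard deviation are estimated from each sample, which makes the plain KS test conservative.

```r
set.seed(1)

# Share of simulated samples in which the KS test rejects normality at the
# chosen alpha. Mean and sd are estimated from each sample, so the plain KS
# test is conservative here (no Lilliefors correction is applied).
reject_rate <- function(draw, n, reps = 200, alpha = 0.05) {
  p_values <- replicate(reps, {
    x <- draw(n)
    ks.test(x, "pnorm", mean = mean(x), sd = sd(x))$p.value
  })
  mean(p_values < alpha)
}

# Very large samples from a mildly heavy-tailed population (t, 5 df)
large_mild <- reject_rate(function(n) rt(n, df = 5), n = 10000)

# Small samples from a strongly skewed population (exponential)
small_skewed <- reject_rate(function(n) rexp(n), n = 10)

cat("Rejection rate, n = 10000, t(5):       ", large_mild, "\n")
cat("Rejection rate, n = 10,    exponential:", small_skewed, "\n")
```

Under these assumptions the large, mildly non-normal samples should be rejected far more often than the small, strongly skewed ones, which is the opposite of what you would want from a check on whether the planned analysis is safe.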
Draft beginners R course
As a follow-up to today’s post on Lyx I have produced two presentations for a beginners course using Lyx/Beamer/Sweave.
I developed a very simple working protocol for making Beamer presentations with Sweave. Once everything is set up it is simple and productive. I use the standard beamer commands for any slide without R code. I don’t ever try to mix beamer features and R code on the same slide.
Here are two introductory presentations using this approach.
And the draft document (with gaps)
The source is in the zip file below (remove the .doc extension and replace it with .zip)
Lyx and Sweave: Worth climbing the learning curve?
One of the interesting elements of using Linux is the demystification of many *nix concepts that you might previously have come into contact with only in passing. One of these is Latex. Any R user in Windows quickly becomes aware that Latex exists. However Latex is fundamentally a *nix thing. It has even been claimed by Windows users that Latex is a legacy typesetting paradigm!
There are certainly some issues with the complexity of Latex. However there is no denying that Latex documents look impressively formal. Personally I think it is worth the effort to look seriously at Latex, even if you are a complete newcomer to Linux.
I was curious about Latex. However I readily admit that I really couldn’t be bothered to learn Latex as such. Life is far too short. Fortunately I found Lyx. Lyx is Latex for the lazy. There is no need to learn all the details: it is advertised as a WYSIWYM Latex typesetting program (What You See Is What You Mean).
So how do you use Lyx? The first important tip is to make sure that you install ABSOLUTELY, BUT ABSOLUTELY, EVERYTHING remotely relevant to TeX and Latex before starting anything with Lyx. Fonts, languages, the lot. This avoids later frustrations with missing sty files etc. I can’t remember offhand all the packages I needed, but do go to Synaptic and look for texlive-full plus anything else remotely related to either tex or latex. This will mean quite a long initial download but you certainly won’t regret it. Without all the Tex/Latex stuff installed the Lyx experience can be frustrating. You are likely to get at least a few annoying and incomprehensible error messages when compiling documents. This could give you the feeling that the whole system is more trouble than it is worth. You will add about 500 MB to your install in total, but that shouldn’t be a problem.
The next step before using Lyx seriously for work is to analyse your own ability with Word or Open Office. The idea of using Lyx is to save time in the long run. Experienced Office users can probably replicate everything Lyx or Latex can do. However many of us do have problems writing long documents such as theses or technical reports in Office suites. It becomes very difficult to maintain a consistent style. I now find Lyx easier to use than any office program for long documents. The logic of using Lyx is to improve productivity, not make simple tasks much more difficult. Define what was most difficult for you to use in an Office suite and find a consistent way to achieve it in Lyx. My main problems were with positioning figures on the page and producing correctly structured tables of contents. Both are very easy in Lyx once you have found out how, but there is an initial learning curve.
Then follow the examples in the Lyx documentation and allow some time for experimentation.
Sweave in Lyx
Most R users will be very interested in the use of Sweave with Lyx. This has been made possible by the work of Gregor Gorjanc. I have only recently realised just how cool this can be. Once all the Latex extras are installed in Ubuntu, following the instructions in “INSTALL” here worked well.
http://cran.r-project.org/contrib/extra/lyx/
A faster way of getting to this point would be to download the file below, which contains my .lyx directory as a zip file with a fake doc extension. Replace the whole of the .lyx directory you have in your home directory with mine. Remember to use Ctrl+H to see hidden files. You might want to back up your original directory first, but the replacement shouldn’t cause any issues.
Then do Tools > Reconfigure in Lyx and you should be able to use Sweave. My directory also has a layout incorporated for making beamer files that I got from here.
http://ggorjan.blogspot.com/2008/09/using-beamer-with-lyx-sweave.html
Here is an example
The source in Lyx that made this is available here (again take off the doc extension and change it to .lyx). You’ll need beamer-latex installed to compile it and you will have to provide your own logo picture.
And here is an early draft of a course for beginners to R that I am writing in Lyx/Sweave. If the documents compile then you have everything installed correctly.
Again there is a bit of a steep learning curve to get Lyx and Sweave working. You will need to read the Sweave manual first (a Google search for Sweave will provide other material).
It is admittedly hard to get Sweave/beamer working, as this is still quite an experimental concept. I found that not all the beamer features work, but you can still get some nice looking slides. The easiest way to use beamer for the time being is to adapt my template by cutting and pasting the box of “ERT” for every new slide, changing the R code within it to what you need. Some styles such as Title and Author stop the document compiling, so don’t change the title page much. Remember to put your own logo in the graphic. The huge payoff in terms of time saved producing slides is made by \begin{frame}[shrink,containsverbatim]. The shrink option automatically adjusts R output to the page. Great as long as you don’t print a huge object. Also notice the use of xtable and <<results=tex>> in code chunks to produce formatted tables.
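As an illustration of how these pieces fit together, here is a sketch of what the contents of one such ERT box might look like. It assumes the xtable package is installed; the frame title and the model fitted to R's built-in `cars` data are my own placeholders, not taken from the template.

```latex
\begin{frame}[shrink,containsverbatim]
\frametitle{Regression summary}
<<results=tex,echo=FALSE>>=
library(xtable)
# Example model on the built-in `cars` data; replace with your own analysis
fit <- lm(dist ~ speed, data = cars)
# print() on an xtable object emits the LaTeX code for the table
print(xtable(summary(fit)))
@
\end{frame}
```

With results=tex the printed LaTeX is passed straight through to the slide rather than shown as verbatim R output, and echo=FALSE hides the R code itself.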
Inserting a histogram into a normal Lyx/Sweave document like the R course is very simple. In an ERT box you just need to write:
<<fig=TRUE>>=
x <- rnorm(100)  # example data; use your own vector here
hist(x)
@
The great thing about this way of working is that you know all the R code you are showing actually runs, as the document won’t compile if you’ve made a typo or syntax error.
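This behaviour can be seen outside Lyx by calling Sweave() directly from R. The sketch below writes two tiny Sweave files to a temporary directory, one with a working chunk and one that refers to an object that does not exist. The file names and the helpers `write_rnw()` and `weaves()` are hypothetical, made up for this illustration.

```r
# Work in a temporary directory so the generated files are easy to discard
old_wd <- setwd(tempdir())

# Write a minimal Sweave file; `body` holds the R code of its single chunk
write_rnw <- function(file, body) {
  writeLines(c("\\documentclass{article}",
               "\\begin{document}",
               "<<fig=TRUE>>=",
               body,
               "@",
               "\\end{document}"), file)
}

write_rnw("good.Rnw", c("x <- rnorm(100)", "hist(x)"))
write_rnw("bad.Rnw", "hist(object_that_does_not_exist)")

# TRUE if Sweave ran every chunk without an error, FALSE otherwise
weaves <- function(file) {
  tryCatch({ Sweave(file, quiet = TRUE); TRUE },
           error = function(e) FALSE)
}

good_ok <- weaves("good.Rnw")  # chunk runs, so good.tex is written
bad_ok  <- weaves("bad.Rnw")   # chunk fails, so weaving stops with an error

setwd(old_wd)
```

Because weaving stops at the first failing chunk, a broken example never makes it into the finished slides or handout.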
I will be mentioning Lyx with bibtex in a later post as this is another potentially major productivity booster that may need some help to use at the start.
A presentation for the first class on the introductory R course I am writing is now available here
