ChristianKl comments on Open thread, Sept. 1-7, 2014 - Less Wrong Discussion

4 Post author: polymathwannabe 01 September 2014 12:18PM

Comment author: ChristianKl 06 September 2014 10:38:45AM 4 points

Out there in the world a lot of people use software like Excel for doing their data processing. They want to have tables where they see their data.

That has the advantage that you have a nice GUI that normal people can easily learn. However, some tasks take a lot of time with tables, and Excel automatically reformats your data when it thinks it knows better than you. Excel also doesn't cope well with 500,000 rows of data, and it doesn't make pretty, customizable plots.

Often the choice is between spending 15 minutes doing a task by hand in Excel and writing 5 lines of R, where you spend the same 15 minutes reading the documentation to find the right parameters.
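To make that trade-off concrete, here is a rough sketch of what such a few-line script can look like. It is in Python rather than R and uses only the standard library. The column meaning (a group label and a value) and the 500,000 synthetic rows are made up for illustration and don't come from any real dataset.

```python
import random
import statistics
from collections import defaultdict

# Hypothetical data: 500,000 rows of (group, value), generated here
# instead of being read from a real file.
random.seed(0)
rows = [(random.choice("ABC"), random.gauss(10, 2)) for _ in range(500_000)]

# Collect the values per group, then summarise each group.
by_group = defaultdict(list)
for group, value in rows:
    by_group[group].append(value)

# Row count and mean per group, in alphabetical group order.
summary = {g: (len(v), statistics.fmean(v)) for g, v in sorted(by_group.items())}
for g, (n, mean) in summary.items():
    print(f"{g}: n={n}, mean={mean:.2f}")
```

The point is not these particular lines but that the row count stops mattering: the same script works for 500 rows or 500,000.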

As a result, in a lot of professional contexts where statistics are needed, people use specialised statistics software. That might be SPSS, Stata, SAS or R. SPSS, Stata and SAS all need a license, while R is free software. State-of-the-art statistics is often done in R, and if someone invents a new statistical method they often publish an R package along with their paper to allow other people to use their shiny new technique.

It's worth noting that statisticians aren't primarily programmers and R is built for statisticians. It has a lot of powerful magic functions with 20 optional parameters.

These days there are also libraries like Pandas for Python that allow you to do most of the things that R can do while at the same time having a beautiful language.
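As a small illustration of the Pandas route, here is a minimal sketch, assuming Pandas is installed. The CSV content and the column names (`id`, `group`, `value`) are invented for the example. It shows two of the things mentioned above: telling the library not to reinterpret a column, and getting a per-group summary in one line.

```python
import io
import pandas as pd

# Hypothetical CSV data, inlined so the example is self-contained.
csv_text = """id,group,value
007,A,1.5
008,A,2.5
009,B,4.0
"""

# dtype={"id": str} keeps the leading zeros instead of guessing a number.
df = pd.read_csv(io.StringIO(csv_text), dtype={"id": str})

# Mean and row count per group.
summary = df.groupby("group")["value"].agg(["mean", "count"])
print(summary)
```

Without the `dtype` argument, `read_csv` would parse the `id` column as integers and drop the zeros, which is the same kind of guessing Excel does, except that here you can switch it off explicitly.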