Online economics
Category Archives: Statistics & data

Worst chart ever

Via Statistical Modelling:

I seriously think people should have to get a license to create charts. Original here.

by aaron. Permalink. Comments (0). Comments RSS.

Using Google Spreadsheets for surveys

Google Spreadsheets has added forms, which allow web users to input data into a spreadsheet that you’ve created. This makes it perfect for running simple surveys. Here’s a quick run-down on how to do it:

First go to Google Docs and create an account if you haven’t already got one. Then make a new blank spreadsheet.

Next, share the spreadsheet by clicking the share button at the top right corner:

(Screenshot: share.png)

You’ll be prompted to name the spreadsheet, and then you get some sharing options. Choose the forms option, then click “Start editing your form”:

(Screenshot: forms.png)

Then you’re presented with a form editor, which is pretty self-explanatory. You can enter a title for the survey, some explanatory text, and then questions, which can be short or long text input, multiple choice (with an “other” option), checkboxes, or selection from a list. You can add, edit, delete and re-order questions easily.

(Screenshot: formedit.png)

When you’re done with the questions, click “Next, choose recipients”. Then you’re presented with various options for giving the survey to people. You can email it to specific people by entering email addresses, and if you like you can embed the form in the email itself (if the recipients can handle HTML email). Alternatively, you can click the “Embed” link at the top right to embed the form in any kind of web page. Just copy and paste the code that’s provided into your web page.

(Screenshot: send1.png)

And that’s all there is to it. Columns corresponding to the questions will be automatically created in your spreadsheet, and any data that people enter will appear in the right place, with a timestamp. One thing lacking is data validation, so you can’t check at the input stage to make sure that people have entered appropriate responses, and you can’t have mandatory questions. It also lacks conditional branching, so every respondent sees every question. Despite these limitations, Google Spreadsheets is a very quick and convenient way to run a simple survey or other data collection exercise.
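Since the form can’t validate anything at the input stage, one workaround is to check the responses after the fact. Here’s a rough sketch in Python, assuming you have the responses as CSV text. The column names, the sample rows and the `find_problems` helper are all made up for illustration; you’d swap in your own questions and rules.

```python
import csv
import io

# Hypothetical export: these column names and rows are invented for the example.
SAMPLE_CSV = """Timestamp,Age,Favourite chart type
3/1/2008 10:15:00,34,Line
3/1/2008 10:17:42,,Pie
3/1/2008 10:20:05,abc,Bar
"""

# Questions we want to treat as mandatory (an assumption, not a form setting).
REQUIRED = ["Age", "Favourite chart type"]


def find_problems(csv_text):
    """Return (response_number, message) pairs for blank or invalid answers."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for response_number, row in enumerate(reader, start=1):
        # Mandatory-question check: flag answers that are empty or whitespace.
        for column in REQUIRED:
            if not (row.get(column) or "").strip():
                problems.append((response_number, f"missing answer for {column!r}"))
        # Simple type check: a non-empty Age must be a whole number.
        age = (row.get("Age") or "").strip()
        if age and not age.isdigit():
            problems.append((response_number, f"Age is not a whole number: {age!r}"))
    return problems


for response_number, message in find_problems(SAMPLE_CSV):
    print(f"response {response_number}: {message}")
```

This only catches problems once the data is already in, so it’s no substitute for proper validation in the form, but it does make it easy to spot the responses that need following up.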

To keep track of new data, you can periodically check the spreadsheet, or you can add a gadget to your iGoogle homepage that automatically informs you when there are new entries.


Less is more

Very loyal reader Chewxy points me to an analysis by Jakob Nielsen of a study, done by other researchers, of web users’ habits.

The headline result for bloggers: people spend only an extra 4.4 seconds on a page for each additional 100 words. At an average reading speed, that means readers will read about 18% of any additional content.
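The 18% figure is just arithmetic, and it’s easy to check. This sketch assumes an average reading speed of 250 words per minute; that number is an assumption for the example (it isn’t stated above), and the function name is made up.

```python
SECONDS_PER_EXTRA_100_WORDS = 4.4  # headline figure from the study
WORDS_PER_MINUTE = 250             # assumed average reading speed


def share_of_extra_words_read(seconds=SECONDS_PER_EXTRA_100_WORDS,
                              words_per_minute=WORDS_PER_MINUTE):
    """Fraction of each additional 100 words that fits in the extra time."""
    words_read = seconds * words_per_minute / 60
    return words_read / 100


print(f"{share_of_extra_words_read():.0%}")  # prints 18%
```

A faster or slower assumed reading speed moves the percentage up or down in proportion, so treat the 18% as a rough guide rather than a precise measurement.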

This graph that Jakob made shows the maximum percentage of words that people could read on a page as a function of its word count, given the average time that people spend on a page of that length and an average reading speed:

Conclusion: If you want to be read by the masses, keep it short.

Interesting analysis, horrible graphs

Richard Cunningham has done some interesting analysis of the types of stories that have been popular on Digg over time. The main trend seems to be that tech stories are shrinking rapidly, while things like “offbeat” stories are increasing. It’s pretty clear that Digg is becoming more mainstream, by accident or by design.

Anyway, I don’t want to be too critical, but the stacked graphs that he uses are just horrible. Check out this one, which shows the number of popular Digg stories by category:

Stacked graphs do show you how the composition of something changes over time. But they also make comparisons of trends across categories very difficult. To compare the lifestyle category to the offbeat category, for example, you have to somehow figure out the relative thickness of the two areas and then try to figure out how this varies over time. For mere mortals like me, this is impossible. Much better would be a line chart that allows easy comparison between categories. If you want to examine how the total composition of stories changes over time, give a table or line chart of the percentage of stories in each category. It’s very rare to come across a situation where a stacked chart is a more effective way to communicate information than a simple line or bar chart.
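Getting from raw counts to the percentage of stories in each category is a small calculation. Here’s a sketch in Python; the counts are hypothetical numbers invented for the example (not Digg’s real data), and `category_shares` is just an illustrative helper.

```python
# Hypothetical counts of popular stories per period; not real Digg figures.
counts = {
    "tech":      [120, 100, 80],
    "offbeat":   [30, 45, 60],
    "lifestyle": [20, 25, 30],
}


def category_shares(counts):
    """Convert per-period counts into per-period percentage shares."""
    periods = len(next(iter(counts.values())))
    # Total number of stories across all categories in each period.
    totals = [sum(series[i] for series in counts.values()) for i in range(periods)]
    return {
        name: [100 * value / total for value, total in zip(series, totals)]
        for name, series in counts.items()
    }


shares = category_shares(counts)
for name, series in shares.items():
    # One row per category: exactly what a table or line chart would plot.
    print(f"{name:<10}" + "".join(f"{share:7.1f}%" for share in series))
```

Each row is one line on the chart, so comparing lifestyle to offbeat is a matter of comparing two lines rather than eyeballing the thickness of two bands.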

(HT: RWW)

Wanted: RSS feeds for data

I often need to pull some economic data from various sources and do a bit of analysis in Excel or whatever. With time-series data especially, it is a pain to pull the new data when a new data point is released (every quarter or whatever) and refresh the analysis. It would be very helpful if there were a standardised format like RSS for data feeds. Then, if it were supported in Excel, I could plug my spreadsheet into that feed and know that my data would be automatically updated.

It doesn’t seem very hard to do, since time series are pretty simple. At a minimum, a feed would just need to contain a description, and then a list of (date, value) pairs comprising the data series. Then all that’s needed is a protocol for publishing and fetching feeds, which could probably work exactly like RSS.
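To make the idea concrete, here’s a minimal sketch in Python of what such a feed could look like: a description plus a list of (date, value) pairs, with a way to publish it as text and to pick out the points a subscriber hasn’t seen yet. Everything here is hypothetical: the `DataFeed` class, the field names, the sample series, and the choice of JSON (used only because it’s short; an RSS-style XML format would do the same job).

```python
import json
from dataclasses import dataclass
from datetime import date


@dataclass
class DataFeed:
    """A hypothetical time-series feed: a description and (date, value) pairs."""
    description: str
    points: list  # list of (date, value) tuples, oldest first

    def to_json(self):
        """Serialise the feed to text that a publisher could serve."""
        return json.dumps({
            "description": self.description,
            "points": [[d.isoformat(), v] for d, v in self.points],
        })

    @classmethod
    def from_json(cls, text):
        """Rebuild a feed from the published text."""
        raw = json.loads(text)
        points = [(date.fromisoformat(d), v) for d, v in raw["points"]]
        return cls(raw["description"], points)

    def newer_than(self, last_seen):
        """Return only the points released after the subscriber's last date."""
        return [(d, v) for d, v in self.points if d > last_seen]


# Made-up series for illustration only.
feed = DataFeed(
    "Hypothetical quarterly GDP growth (%)",
    [(date(2007, 9, 30), 0.5), (date(2007, 12, 31), 0.8)],
)
round_tripped = DataFeed.from_json(feed.to_json())
print(round_tripped.newer_than(date(2007, 9, 30)))
```

The `newer_than` step is the bit that matters for the spreadsheet use case: the subscriber remembers the last date it has and only pulls in what has been released since.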

I did a bit of searching, and companies like EcoWin are selling data feeds in some kind of proprietary format, and there are obviously many providers of financial data feeds. What would be more useful is if national statistical agencies, the OECD and so on supported a common format and used it to publish the data that they currently provide in less convenient forms on their websites.

© Copyright 26econ.com 2008