Here are the abstract and the data file of my thesis. You can also download the whole text (PDF, in Czech only).

A byproduct of this thesis is a random sample of a few hundred thousand YouTube videos, which you can download and use for your own research.

This thesis deals with amateur content on YouTube through the lens of Bordwell and Thompson's film theory. Eight distinct categories of audiovisual style are identified within a sample of 50 videos. These categories are defined and described using eight case studies, one per category. Subsequently, the exact proportion of each category is ascertained by means of a quantitative study of 385 YouTube videos. The study shows that the most frequent forms of expression on YouTube are so-called slideshows and video snapshots. Other categories of audiovisual style found among user-generated YouTube videos are: videoblog, home video snippet, screencast, document, short film and editing remix. Only a small proportion of YouTube videos do not fit into the eight categories. The thesis also includes a treatise on the media significance of YouTube, a summary of available texts about YouTube, and an enumeration of factors influencing the audiovisual style of YouTube videos. The thesis further points out the shortcomings of existing YouTube research and proposes a method that should ensure the most representative sample of YouTube videos possible.

Data file
The data file is in SPSS .sav format. You can download it here.

Here are some of the interesting findings of this thesis:
  • 79 % of YouTube videos are user-generated. The rest is either professional or ripped content.

  • 25 % of user videos on YouTube are actually slideshows of static photographs

  • 18 % of YouTube videos are "video snapshots": short, quick-and-dirty recordings of everyday stuff

  • the average duration of a YouTube video is 4:46

  • if the claim that 20 hours of video are uploaded every minute is true, then about 365,512 videos are uploaded every day
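
The arithmetic behind the last bullet can be sketched as follows. This is only an illustration, not the calculation used in the thesis: it assumes the daily count is simply the total uploaded footage divided by the mean duration, and it uses the rounded mean of 4:46 from the list above. With that rounded value the result lands near 362,500 rather than exactly 365,512; the gap presumably comes from the unrounded mean duration, which is not given here. The function name is made up for this example.

```python
def estimate_videos_per_day(hours_uploaded_per_minute, mean_duration_seconds):
    """Estimate the daily upload count from upload volume and mean video length."""
    # Footage uploaded per day, in seconds:
    # hours per minute -> seconds per minute -> seconds per day.
    seconds_of_video_per_day = hours_uploaded_per_minute * 3600 * 60 * 24
    return seconds_of_video_per_day / mean_duration_seconds


# Assumption: the rounded mean duration of 4:46 quoted in the findings.
MEAN_DURATION_SECONDS = 4 * 60 + 46

estimate = estimate_videos_per_day(20, MEAN_DURATION_SECONDS)
print(f"{estimate:,.0f} videos per day")
```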

Please note: N = 385, margin of error ±5% at a 95% confidence level. Also, the sample might not be completely unbiased – see Sampling YouTube.
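
For readers who want to check these figures, here is a minimal sketch of the standard normal-approximation formulas for a proportion. It assumes simple random sampling from a very large population and the worst case p = 0.5; the thesis may have derived N differently, and the function names are invented for this example. Under those assumptions, a ±5% margin at 95% confidence calls for a sample of 385 videos, which matches the N above.

```python
import math
from statistics import NormalDist


def z_score(confidence_level):
    """Two-sided critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(0.5 + confidence_level / 2)


def margin_of_error(p, n, confidence_level=0.95):
    """Half-width of the normal-approximation interval for a proportion p."""
    return z_score(confidence_level) * math.sqrt(p * (1 - p) / n)


def required_sample_size(margin, confidence_level=0.95, p=0.5):
    """Smallest n whose margin of error does not exceed `margin`."""
    z = z_score(confidence_level)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


print(required_sample_size(0.05))              # sample size for +/-5% at 95%
print(f"{margin_of_error(0.5, 385):.4f}")      # worst-case margin at N = 385
print(f"{margin_of_error(0.79, 385):.4f}")     # margin for the 79% finding
```

Note that findings reported as a share of user videos only (such as the slideshow figure) rest on a subsample smaller than 385, so their margins are somewhat wider than the worst-case value computed here.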
