Some Stats and Flows

Today was spent grinding out some descriptive statistics.

My sample = 21826 people

Creators = 5802 (26.58%) 

Patrons = 16024 (73.42%)

I scraped the data using a snowballing technique, starting with the most popular creators. The platform wasn't able to provide me with statistics about its user base, so it is difficult to determine how representative this sample is. I compared the number of backers each creator had in reality with the number they had in my network using Pearson's correlation coefficient, which suggests a strong positive relationship (r = 0.745, p < 0.05).
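For anyone wanting to try the same comparison, here is a minimal sketch of the calculation in Python. The numbers are made up for illustration, and `pearson_r`, `listed_backers` and `backers_in_sample` are names chosen here; this is not the script behind the figure above, and it does not compute the p-value.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares around the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)

# Illustrative values only (assumption): backers shown on each creator's page
# versus backers that ended up in the scraped network.
listed_backers = [120, 45, 300, 80, 15]
backers_in_sample = [90, 30, 210, 75, 10]

r = pearson_r(listed_backers, backers_in_sample)
print(round(r, 3))
```

A value close to 1 means creators with many backers in reality also have many in the sample, which is the sense in which the correlation speaks to representativeness.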

Some people include their location details on their patron profile pages with varying degrees of specificity (zip code, city, state, country and continent). I classified individuals who included this information by country (99 altogether), and you can see the general picture below. 

Screen Shot 2015-07-17 at 16.20.45.png

As you can see, my sample is dominated by the US, but I love that there are people in Antarctica, Saipan and the US Virgin Islands getting involved in global patronage networks. Unfortunately, 13,375 people don't include location details (28% of creators and 73% of patrons).

The platform classifies creators into 14 genres: animation, comedy, comics, crafts and diy, dance and theatre, drawing and painting, education, games, music, photography, podcasts, science, video and film, websites and writing. Unfortunately, this information isn't scrapeable, so I had to code it from the information included on creators' pages. I used Excel's 'find' function to classify these, which saved looking at almost 6000 webpages myself! This means there will be errors, and some people work across genres, but with this amount of data (and level of funding!) taking this into account isn't possible.
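The same kind of keyword search can be sketched in Python. This is only an illustration of the approach, not the Excel workflow itself: the keyword lists in `GENRE_KEYWORDS` are hypothetical (the actual search terms aren't recorded here) and `classify` is a name chosen for the example.

```python
# Hypothetical keyword lists (assumption); only four genres shown for brevity.
GENRE_KEYWORDS = {
    "music": ["music", "song", "album"],
    "podcasts": ["podcast"],
    "comics": ["comic"],
    "video and film": ["video", "film"],
}

def classify(description):
    """Return every genre whose keywords appear in the page text."""
    text = description.lower()
    return [
        genre
        for genre, words in GENRE_KEYWORDS.items()
        if any(word in text for word in words)
    ]

print(classify("I make a weekly podcast about film music"))
print(classify("Hand-knitted scarves"))
```

The first call matches three genres and the second matches none, which shows both sources of error mentioned above: plain substring matching over-assigns people who merely mention a genre, and anyone whose page uses none of the keywords is left unclassified.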

As you can see video and film is the most common activity. Unsurprisingly for a web-based patronage network, tangible (e.g. crafts and DIY) and performance-based products don't feature strongly. Colleagues and I have recently written a chapter on scopic regimes in skateboarding media, and there's the potential to explore this area here.

The geography of genres is interesting too. The diagram below shows connections between creators in each country and the genres they work in (larger version here).

5802 creators aggregated by country, coloured by continent. 

Using location quotients, I examined the data for areas of specialism (5802 minus the creators with no country or an undetermined genre = 3792 creators). Because the total is so low for some places, I'm ignoring the scores for countries with fewer than 10 creators (this leaves 23 countries). Despite Video and Film accounting for the largest proportion of genres (28% in this sample), only the Netherlands, UK, Portugal, Australia, Italy and USA have LQ scores over 1. Drawing and Painting is found more evenly, across 17 countries.
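For readers unfamiliar with location quotients: the LQ for a genre in a country is that genre's share of the country's creators divided by the genre's share of all creators, so a score over 1 means the country is more specialised in the genre than the sample as a whole. A minimal sketch, assuming the data is held as nested counts; `location_quotients` and `min_creators` are names chosen for this example, and the counts are made up.

```python
def location_quotients(counts, min_creators=10):
    """counts: {country: {genre: creators}} -> {country: {genre: LQ}}.

    Countries with fewer than min_creators creators still count towards the
    overall totals but get no scores of their own (an assumption about how
    the small-country cut-off is applied).
    """
    total = sum(sum(genres.values()) for genres in counts.values())
    genre_totals = {}
    for genres in counts.values():
        for genre, n in genres.items():
            genre_totals[genre] = genre_totals.get(genre, 0) + n

    result = {}
    for country, genres in counts.items():
        country_total = sum(genres.values())
        if country_total < min_creators:
            continue
        result[country] = {
            # (share of genre within country) / (share of genre overall)
            genre: (genres.get(genre, 0) / country_total) / (genre_total / total)
            for genre, genre_total in genre_totals.items()
            if genre_total > 0
        }
    return result

# Made-up counts for two countries, for illustration only
counts = {
    "A": {"video and film": 30, "music": 10},
    "B": {"video and film": 10, "music": 30},
}
lq = location_quotients(counts)
print(lq["A"]["video and film"], lq["A"]["music"])
```

In this toy example each genre makes up half of all creators, so country A, where three quarters of creators work in video and film, scores 1.5 for that genre and 0.5 for music.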

To understand variance between countries more clearly I standardised the data and ran a principal component analysis.
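As a rough sketch of what those two steps involve, the snippet below standardises each column to z-scores and then finds the first principal component by power iteration on the correlation matrix. It is a pure-Python illustration under stated assumptions, not the software used for the analysis: the rows are made-up values standing in for countries, the columns stand in for genres, every column is assumed to vary, and `standardise` and `first_component` are names chosen here.

```python
import math

def standardise(rows):
    """Convert each column to z-scores (mean 0, sample standard deviation 1)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1))
        for c, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def first_component(z, iterations=500):
    """Weights of the first principal component and its share of variance."""
    n, p = len(z), len(z[0])
    # Correlation matrix of the standardised columns
    corr = [
        [sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
        for i in range(p)
    ]
    # Uneven starting vector so it is unlikely to be orthogonal to the answer
    vec = [1.0 / (i + 1) for i in range(p)]
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    eigenvalue = sum(
        vec[i] * sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)
    )
    # The trace of a correlation matrix is p, so eigenvalue / p is the share
    return vec, eigenvalue / p

# Made-up data: 5 rows (countries) by 3 columns (genres)
rows = [
    [1.0, 2.0, 5.0],
    [2.0, 4.1, 4.0],
    [3.0, 5.9, 3.2],
    [4.0, 8.2, 1.9],
    [5.0, 9.8, 1.0],
]
z = standardise(rows)
weights, share = first_component(z)
print([round(w, 2) for w in weights], round(share, 2))
```

Standardising first matters because otherwise the genres with the largest raw counts would dominate the components simply by having the largest variance.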

Screen Shot 2015-07-18 at 14.56.39.png

The variance is interesting and something I need to unpack as the analysis continues. Most interesting, for me at least, is the loading plot, which resembles some of the patterns seen in the genre data in the social network analysis. I need to examine this more closely.

More on patrons in my next post.