Sorry for the belated report, I’ve been catching up and a little unwell. I was lucky enough to participate in a panel entitled “We Need To Talk About TED”, which was chaired by Chad Orzel and featured (in programme order) Dr Sarah Dillon, Andrea Phillips, Vanessa Harden, and me! The panel was split into two groups based on who had and who had not given a TED talk (and for the record, I haven’t) so we focussed on the experiences of people who had undertaken TED talks and, from the academic perspective, the impact, reach and curation issues of TED talks. TED stands for Technology, Entertainment and Design and the basic idea is that TED is used to disseminate “Ideas worth spreading” and has been for about 30 years. In recent years, electronic delivery has increased the reach of TED and TED talks have a very high impact across the video streams of the world.
From those who have attended, a big TED event is an absolute blast. You do a full rehearsal the day before, it’s high production, you get a big haul of high-quality swag, and it’s a reputation booster. Vanessa, an artist who works in interventions (altering spaces to get people thinking and interacting with the space in new or other ways), made the really good point that there was such a big divide between those who had been part of TED and those who were looking in on it from an academic perspective. This is a great point, because it helps us to remember that TED talks (the big ones) are something that you strive to get to present. The main conference is limited to a smallish number of people and is very expensive, so you have to be showing something pretty good to get there. Chad commented that, from his experience, TED talks were pretty well-balanced in terms of demography and inclusion and this was reiterated by other panellists as well.
My issue stems from my outsider’s view on TED, looking at what we are seeing, the impact of these highly professional and appealing videos and what has come from them. I’m a bit of a numbers guy so let’s look at the numbers. (The other panellists should feel positively inspired to jump up and down in the comments to get their points further exposed too. Please!)
Overall, if we look at the top 20 TED talks, they are roughly 15:50 minutes long (on average), which is longer than the usual TED average of about 14:10, and they have, in total, over 88 million views. If people are watching them to the end, then that means we have spent:
watching the top 20 TED talks alone. That’s quite a lot of time, so are all of these videos worth watching? Well, a few of them deal with technology that has … for want of a better word … never quite worked outside of the TED talk or delivered on the scale promised. If we look at SixthSense technology, people have spent about 400 years watching videos on something that has never quite delivered. Why are these videos still in the top 20?
There’s something called the Matthew Effect, which basically means that success breeds success (you might know it as ‘accumulated advantage’). Because these talks made it on to the top 20 a few years back, they get watched more because they have made it on to the top 20. Now, within the top 20, we do see some movement up and down the rankings, and, yes, the SixthSense stuff is dropping (about 5 places over the last 2 years) but it’s still there because you look for a list of top 20 videos (this one is from 2012) and that’s what comes out. New videos do make it on and make it higher up – the video on things you didn’t know about orgasm is (quelle surprise) 5 places higher after 2 years – but, in terms of a video, success breeds success because of the way that our recommender and listicle based web culture works.
So, even though TED can distribute ‘ideas worth sharing’, there is inertia in getting rid of “ideas that would have been worth sharing had they actually worked properly”. Don’t get me wrong, the tech world is full of “tried it, didn’t work, try something else” but the problem here is that the well-produced and lingering TED videos can make something look current and attractive when, in reality, it’s over and done. As far as I know (corrections below please), there is no TED expiry mechanism.
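For anyone who wants to check the arithmetic, here is a back-of-envelope sketch. It uses only the two figures quoted above (88 million views, an average length of 15:50) and the assumption, already flagged, that every view runs to the end. The function name is mine, and "over 88 million" makes the result a lower bound.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # one Julian year


def total_viewing_years(views, minutes, seconds):
    """Total watch time in years, assuming every view runs to the end."""
    seconds_per_view = minutes * 60 + seconds
    return views * seconds_per_view / SECONDS_PER_YEAR


# Figures quoted in the post: 88 million views at an average of 15:50 each.
years = total_viewing_years(88_000_000, 15, 50)
print(f"{years:,.0f} years")
```

On those figures the total comes out at roughly 2,650 years of continuous viewing.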
TED really shines at Entertainment and Design because TED is a platform for 15 minutes of inspirational wonder – there is no doubt that some of the best talks on the Internet can be found at TED. (My favourite is Dan Pink, to no-one’s surprise.) But as those who had given TED talks on the panel noted, if your rehearsal was too technical in nature, it was time to remove slides and keep the message moving along a little more simply. And this is where we started to head into murkier waters with technological issues.
There’s nothing special about technology and giving inspirational talks about technology, as long as we’re clear on what we’re talking about. There is a big difference between talking about something that works and talking about something that might work. An idea in entertainment and design can be realised by many people and shared widely. An idea based on technology and the development of a key platform will need that platform, which is where SixthSense appears to have fallen down.
So the next thing about TED that we really have to talk about is that it’s always going to be light on the detail of technology and it’s easy to confuse wishes with reality, presentations with progress. Not a problem if we’re clear but a real problem if we’re not.
The final thing that I wanted to address was the curation of TED because not every TED talk goes up on the web. It’s a big deal to make it up onto the TED site, or associated YouTube channel, especially if you’re coming from one of the TEDx local talks or something other than the big talks. And one thing that happens in that curation is a demographic shift. Suddenly, the balanced environment of the conference turns into a video presence that is only 27% women, and a vast over-representation of Californians. There also appears to be a (questionably subtle) bias against certain issues, including women’s reproductive health and some less conventional approaches. If we were getting all of the TED talks then we would see what was going on. Instead, what we see is a snapshot that appears to refine the TED message seen by people at the talk into something that we get to see from the outside. And that’s not ideal.
So, if TED inspires you, that’s great but you need to keep in mind the list of things I outlined above, which comes from the whole group of people on the panel:
- TED talks are popular, at least in some part, because they are popular and there’s no clear way of telling if a TED talk is a good presentation for an idea that is proven, yet to be proven or has in fact been found to need a lot more work.
- TED is great for the E and the D but the technology can be light on. That’s fine if you’re looking for inspiration but not always so great if you’re looking for platforms that work and work now.
- The curation of TED appears to have some bias issues and, whether that’s deliberate or an accident, it’s skewing TED to being a bit of a “things Californian guys like” and, while that isn’t necessarily a problem, it doesn’t appear to reflect what TED is when you get inside.
- TED may not set out to be elitist but its exclusivity and curatorial approach appear to cultivate an elite aspect that could fall into that trap.
It was a fascinating panel to be on and I learned a great deal from talking to the other panellists! I’m sure that they will pull me up on things that I missed.
R U Ok? Day (September the 11th) is coming up soon, with its focus on reaching out and starting conversations with people that you think might not be ok, or might benefit from a friendly conversation. It’s a great initiative and, as someone who has struggled with mental illness, I’m so happy to see us talking openly about this. For me to out myself as having suffered with depression is no big thing, as I discuss it in other parts of the ‘net, but I realise that some of you might now look at what I do and what I say in a different light.
And, if you do, I have to tell you that you need to change the way that you think about these things. A very large number of humans will go through some form of mental issue in their lives, unsurprisingly given the levels of stress that we put ourselves under, the struggle some people have just to survive and the challenges that lie ahead of us as a rather greedy species on a finite globe. So, yes, I’ve suffered from depression but it is an illness. It is treatable and, when it is treated and managed, then you can’t tell that I have problems. In fact, like many people with the problem, even when I’m suffering, you wouldn’t really know. Nobody asks to get mentally ill so stigmatising, isolating and discriminating against people with a treatable mental condition is not just wrong, it’s pretty stupid. So let’s get beyond this and start talking, openly.
That’s where RUOK? is great because it gives you a day and some agency to reach out to someone who seems a little … off and ask them if they’re ok. Trust me when I say that 99% of them will appreciate it. Yes, 1% might give you some grief but if I knew a bet would pay off 99% of the time, I’d take it. The web site has some great tips for starting conversations so please read them if you’re thinking about doing this. (Pro tip: starting a conversation with “You should just cheer up” is not a great way to start. Or finish. In fact, just scratch that and try again.)
I am very open with my students, which I know some people think is potentially unprofessional, and I am a strong believer in cognitive apprenticeship. We are, pretty much, all the same in many respects and me pretending that everything I do comes fully formed and perfect from my amazing brain is a lie. My wisdom, such as it is, is the accumulated memory of the mistakes I’ve made that haven’t killed me yet. My students need to know that the people around them struggle, wonder, stress out and, quite frequently, rise above it all to keep on doing wonderful and beautiful things. I am still professional but I am honest and I am human.
I want to share with all of you something that I wrote on the death of Robin Williams, which I’ve edited slightly for language, but it’s been shared a lot over my other social feeds so it obviously resonates with people. However, many of my students won’t have seen it because I keep my private social life and ‘work’ social media separated. So here it is. I hope that you find it useful and, if you need help, maybe it nudges you to seek it, and if you know someone you’re worried about, it inspires you to ask them “R U OK?”
Mental illness is a poisonous and weird thing. If your eyes changed function, you’d see things differently. When your brain changes function, everything gets weird – and the only impression you have of the perceptual world is suddenly flawed and untrustworthy. But it’s a biochemical issue like diabetes – regulatory systems that aren’t working properly and cannot just be “got over” by thinking happily. Ask a diabetic whether they’ve “really tried” to handle their sugar and see how far that gets you. :)
I wrote something, years ago, that I’ve never posted, to try and explain why some people just can’t stay. The nastiest thing about mental illness is that it can show you a world and a way of thinking that makes suicide apparently logical and, even more sadly, necessary. If you saw that world, then maybe you wouldn’t stay either. This doesn’t make it easier on the survivors but it’s important to recognise the role that an actual illness plays here. That f***ing ba***rd, cancer, takes people from us all the time but it at least has the decency to wield the knife itself. Depression puts the knife in the hands of its victim and makes it look like calculated agency, which hurts the people left behind even more.
There is no magic bullet for helping people with mental illness. Some need visible support. Some need solitude. Some need to work. Some drown in it. That’s because mental illness affects people, in all of their variety and their glorious irrationality, and I am no more a poster child for depression than anyone else. I can’t even tell you how to help me and, given how much I communicate, that’s the most irritating thing of all. But I do know that the ongoing support of caring people who are watching and listening makes a big difference and those of you who are aware and supporting, you keep up that good work! (And thank you, on behalf of the people who are still here because other people helped.)
It’s a sad day with Robin Williams passing but this is only a part of him. It’s a sad and mad part of him and I wish it hadn’t happened but I won’t let it define him, because his struggles were a part of him and his contributions to laughter and joy were so much greater. The least I can do is to see past his ‘mental diabetes’ to celebrate his actual talent and contribution. And offer my deepest sympathies and condolences to his family and friends.
Rest well, Robin.
WorldCon 2014: LonCon3. Why I was wrong not to come sooner and why you should be here next time. #LonCon3 #WorldCon
Posted: August 19, 2014
I’ve been at the World Science Fiction Convention (WorldCon) for the past few days – as anyone with a Twitter account knows if you’ve been following the #LonCon3 tag. Before I say anything else, let me be up front and say that I was pretty stupid not to have made the effort to come to a WorldCon before but, and this is a big but and I cannot lie, you shouldn’t change your mind on WorldCon and SF Cons in general from listening to me, because if you’re not quite sure about them then you should probably look deeper. SF Cons don’t need the external approval of the temporary visitor to make them worth doing so let me remove myself from the role of anthropological validator.
I am, and have been for years, someone who enjoys Science Fiction and Fantasy. I grew up on it and viewing and reading SF&F was an essential part of my fascination with reading. But I had never considered myself a fan as I had some weird ideas in my head about what fannish culture is. And, like any stereotype, I was stupidly, badly, and offensively wrong. Of course I’m a fan. I like things in the SF&F domain and I follow them, think about them and (occasionally) study them. So I’m a fan.
But I’m not validating this space because (a) it’s offensive to think I have that level of authority and (b) it comes pre-validated by the 10,000 people who showed up. This is a strong community and, as I discovered, it’s a diverse, accepting, warm and friendly community, full of interesting people. Are there some jerks? Yes. But far fewer than I’ve run into outside of this space so let’s not fool ourselves into thinking that this is some sort of amazing jerk space. You’ll meet more jerks in the average pub and you won’t be able to talk to them about something that fascinated you when you were 12. :)
So let’s go through some reasons why, if you like any SF&F, you should try to make it to one of the big cons.
- WorldCon has spent a lot of time making this a safe, inclusive and accepting space.
Some years ago, and in the news recently, we’ve heard a lot about people being creepy at Cons and semi-powerful people who abuse that power. This year, I saw a couple of jerks, but WorldCon has a Code of Conduct that clearly accepts non-harmful behaviour but totally rejects anything that causes problems for other people. I’ve seen it in action and it works, swiftly. People have a right to feel safe and the new Code of Conduct policies guarantee that.
Having said that, this is such a queer/trans/body-shape/everything accepting space (not tolerant, because people aren’t putting up with it, they’re accepting it) that it’s hard to imagine a place that would be more so.
This is a city of 10,000+ people where everyone is accepted unless they are being an arse. This is, thanks to oversight and fantastic volunteers, a very safe space.
- Someone here is as interested in your interests as you are.
I have lost track of the discussions I’ve had with total strangers in lifts, escalators, walking around and the amount of information we’ve shared. If you like it, someone else does too. Better still, they may know something about it you didn’t and you can enjoy talking about that.
Hate something? I guarantee that you can have a bile-rending discussion over a beer with someone who also thinks that (insert popular thing here) is a blight upon the history of our species. But these are fun discussions, controlled and shared knowledge between equals. I saw very little knowledge snobbery here.
- You will get to meet, listen and talk to the giants of the field, other fans and experts as they talk about everything.
LonCon3’s problem was not that there was not enough to see in the way of signings, discussions, panels and talks, it was that it was impossible to fit everything into one track for all of these days so collisions in the schedule were inevitable. But if you like Charles Stross, well he’s talking over there. Big fan of Culture politics? There’s a panel for that. Do you like Karaoke with SCA members? That’s an event from 11pm.
And everyone is just wandering around and you can talk to them if you want to. Better yet, do you think that you have something to say – let them know and you might be able to get onto a panel or a talk.
- The people are lovely in the vast majority.
I have spent the last four days being amusedly tolerated in my wide-eyed cluelessness by a large number of lovely people but, even when bumping into someone, the mutual apologies have almost become farcical. Yeah, there are a very small number of clueless jerks but I had worse experiences getting here on the Tube than I did for the whole Con.
I was fortunate enough to come here with my very generous cousin, Curt, and quickly caught up with Australians, but I also fell in with a very nice (if slightly mad) group of Canadians who have welcomed me (or at least been astoundingly polite about not getting rid of me.) Everyone is here to have fun and enjoy the community. The code of conduct covers those who can’t play well.
- There is a lot of stuff here.
There is art, books, items, t-shirts – the Dealers’ room is regularly travelled and things are at a reasonable price. There’s a bar and lots of food so that if you are eating and drinking at weird hours then you are catered for. Better still, you will run across people who know where to find what you want if it doesn’t happen to be here.
- It is a whimsical, beautiful space. Seen the Lego Movie? The random and ephemeral beauty of Cloud Cuckoo Land captures the sense of this very well. Want to dress up as Holmes and Watson but you’re two young women in love and walk around hand-in-hand? This is your place. Do you like Tiki Daleks? Welcome. Every so often, bubbles drifted through the space and lent a strange and alien air to the proceedings.
- They are very kind to newcomers. I stumbled around slightly lost and looking for all the world like some kind of alien anthropologist who had finally deigned to look at the ants’ nest rather than following random ants. But people realised that my curiosity and questions came from a desire to understand and, as I said, now I’m a fan.
I was expecting it to be good, but I wasn’t expecting it to be great. And if I’d known how good it was, I would have started making time to come years ago. Three friends of mine made it possible for me to undertake this but two of them couldn’t make it this year and I miss them, firstly because it would have been great to see them, but also because I know how much delight they would have taken at me finally getting my head out of my arse on this and realising how good it was. Thank you, Curt. Thank you, D & J.
I certainly hope to be back!
Digital Humanities: Reflections on distant reading and why the ability to fly hasn’t stopped us walking.
Posted: August 17, 2014
One of the themes explored in the Digital Humanities is often “what exactly do we mean by Digital Humanities” because everyone asks and there are any number of self-described skeptics who seem to have an inability to add any new categories to their knowledge hierarchies. We’re studying the intersection of traditional computing and humanities so we’re asking the old question of “where does the desert end” which is only answered locally and specifically, rather than globally and generally. But a major fear for Humanists that came up during the week I was in Maryland was the threat of a colonising external force that would fundamentally alter what Humanists did until it was unrecognisable. I’m going to talk briefly about my view of digital humanities as a parallel augmentation, rather than a displacing colonisation.
Many areas of Humanities use the notion of close reading, where the text is carefully read and interpreted as part of a sustained effort. While this is exemplary for extracting themes and really getting into the work it doesn’t scale up well. We keep producing things to read and there is a limit as to how many things you can close read. This is where distant reading can come in, because it scans works thematically and syntactically, and provides an aggregate or abstraction to the reader. This is scalable and fast, because we can computerise it, but it risks inaccuracy, shallowness and is guaranteed to have the bias of the analysis software.
Let me step back and talk about travel for a moment. We started (well, by we, I mean bipedal humans) moving around on our feet. Then we did things with animals – in a vehicular sense – then the wheel, then lots of wheels, plus animals, plus betting – and that’s how we got the Colosseum. At some point, we stopped trying to put petrol into animals (who kept exploding) and tried it in cars instead. Suddenly we could zoom around, which widened our stride, but had the downside of enabling Italian Futurism at the start of the 20th century which led to all sorts of odd things and the declaration of war as the “great hygiene” until a lot of them died in the Great War – seriously, Marinetti, what were you thinking? (As a side note, the Futurist Cookbook is worth reading because it’s very Heston Blumenthal, just 90 years beforehand.)
Then we developed planes and the journey that took months on foot, weeks on animal and days by car, could take hours. But we never stopped walking, although we could now use our more advanced techniques to walk in new places and ultimately go further.
I feel exactly the same about close and distant reading. There are now (hooray) too many books on most subjects for any person to read in their life, let alone in a span to allow detailed analysis in a timely fashion. But this doesn’t mean we have to stop close reading. It means that we can look into topics and areas, refine our distant reading and visualisation, and then drill down once we’ve landed somewhere. Better still, distant reading allows us to link areas of close reading that may not be apparently connected – we can fly to a new place to explore that will develop the knowledge we already have.
Personally, I’d love it if the Humanities came and did a bit of colonisation in Computer Science, but I can completely understand why the reverse is culturally confronting. And I can also understand the many trad CS people who would also feel threatened by a counter-colonisation – although I probably don’t agree with their reasoning.
Going to a course like this is always good for my thinking as it requires me to switch gears and lens to get things done. I strongly recommend stepping out of the comfort zone of your own discipline when you can as it gives you extra knowledge and some valuable perspective.
Humanities Intensive Learning + Teaching, Day 5, Maryland Institute for Technology in the Humanities, #hilt2014
Posted: August 17, 2014
Sorry for the delay in completing this – it has been crazy and I prefer to write directly into the live blog, which means a network feed, as I just find it easier to put things together this way. (It’s only been a week (ish) anyway.)
Today (well, then), we looked at modularity and how we could break networks into separate communities. This is important because it helps us to be able to see structure. The human eye is a great tool but it often needs help and modularity is a good way to do this. We have strong ties between components in directed networks (unidirectional) because we have taken the time to say that the link goes this way. We have weak ties in undirected networks because there is no solidity to the association from one side to the other. The more connected something is, the more strongly tied it is to the things it connects to so, when we hunt for communities, we want to take away the fewest connections to produce the largest number of connected communities.
Let’s look at the Hamlet graph (after Moretti) with some colouring added for the various communities and all the connections left in.
Now let’s take out the intra-community links.
Now we can suddenly see, in terms of dialogue exchanges and interaction, how isolated Hamlet actually is. The members of the court are one community. The invaders are another. Horatio, the proxy for the Danish state, is deeply involved in statehood. Hamlet, however, is more heavily connected to the characters who help with his dawning realisation that something has gone awry. No wonder he goes mad, he’s the Nigel Nofriends of Medieval Denmark, with only the insane Clown Posse and a group of second-rate actors to keep him company.
After this we spent some extensive time working on producing prettier looking graphs with Gephi and using more advanced algorithms to help us to represent what’s going on. We looked at the Twitter data for the conference and came up with this:
This shows the separate groups of Twitter user communities who used the HILT2014 hashtag. My small cluster is the small purple arrow bottomish left. We’ve also ranked the size of the data labels based on the significance of that person in the Twitter information chain, using the PageRank algorithm. So, Jim McGrath wins Twitter for HILT 2014! We also looked at how important certain links were in the network by looking at edge betweenness to determine which links are used by the most shortest paths and then used this to identify important connected components.
Lots of good hands-on stuff and a very interesting course! I learned a great deal and have already started using it elsewhere.
Humanities Intensive Learning + Teaching, Day 4, Maryland Institute for Technology in the Humanities, #hilt2014
Posted: August 8, 2014
Or, the alternative title, “The Play Formerly Known as Hamlet”. Today had a lot of fascinating discussion where the bipartite nature of our class network became apparent, in terms of the majority of the class who were Digital Humanists and really understood a detailed close reading of Hamlet – and your humble narrator.
Today we talked about modularity, which allows you to break your network up into separate modules, which makes a lot more sense when you call those modules things like communities, groups or clusters. Can your network be split up into areas where some groups are more connected to each other than they are to adjacent nodes? By doing this, we are trying to expose structural features of the network and, in particular, unexpected aspects of network structure – is there some set of nodes that have a role in the network that we don’t understand? If we talk about this in terms of people, if our social network is made up of small communities with connections between the communities, then we would expect gossip to spread more rapidly inside a community than between the communities. Simple, eh? The approach we take depends upon comparing the structure we have with a model of a random network using the same number of links.
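To make that comparison a little more concrete, here is a minimal sketch of how a modularity score can be computed for a given split. It is not what Gephi runs internally and not something we wrote in class. It uses the usual formulation (Newman's), which compares against a random network where every node keeps its degree, and therefore the same number of links overall. The toy graph of two triangles joined by one link is made up for illustration.

```python
def modularity(edges, communities):
    """Modularity Q of a split of an undirected, unweighted graph.

    edges: list of (u, v) pairs; communities: list of sets of nodes.
    """
    m = len(edges)
    community_of = {node: i
                    for i, group in enumerate(communities)
                    for node in group}
    internal = [0] * len(communities)    # links with both ends in community i
    degree_sum = [0] * len(communities)  # total degree of nodes in community i
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    # Observed share of internal links minus the share we would expect
    # if the same link ends were wired up at random.
    return sum(internal[i] / m - (degree_sum[i] / (2 * m)) ** 2
               for i in range(len(communities)))


# Hypothetical toy network: two triangles joined by a single link.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
q_split = modularity(toy_edges, [{0, 1, 2}, {3, 4, 5}])
q_single = modularity(toy_edges, [{0, 1, 2, 3, 4, 5}])
print(round(q_split, 3), round(q_single, 3))
```

Splitting the toy graph into its two triangles scores well above zero, while lumping everything into one community scores exactly zero, which is the "no better than random" baseline. This only scores a split you hand it; searching for the best split is a separate and harder job.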
Once we’ve done this, we can use this in a tool, such as Gephi, to clearly illustrate the groups. Here’s a grumpy Dane in illustration.
I’ve coloured and tagged the network to show the key players in Hamlet, based on Moretti’s analysis of Hamlet, which attached unweighted connections between participants who had direct conversations. Now, we’ve made the size of the nodes reflect their relative importance (in terms of betweenness, the number of shortest paths that must go through this person). If we did this in terms of PageRank, another measure of the relative importance of nodes, based on their connectivity, the nodes in green would jump up in size. But what you should note is that Hamlet and Horatio are roughly the same size and, while Hamlet is much more connected to everyone (quelle surprise, given the play’s named after him), the only thing that we lose if Hamlet disappears is that we no longer can hear from the Insane Clown Posse, Lucianus and the Lord. In purely connected terms he doesn’t appear to be that important. We’d obviously lose a lot of the text if he disappeared but how important is Hamlet in Hamlet?
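The gap between "most connected" and "most in-between" is easier to see on a small example. The sketch below is an assumption-laden illustration rather than the class material: it computes unnormalised betweenness with Brandes' algorithm on a made-up seven-node network (the labels `hub`, `bridge` and so on are hypothetical, and this is not the Hamlet data).

```python
from collections import deque


def adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def betweenness(adj):
    """Unnormalised betweenness (Brandes), undirected and unweighted."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                       # nodes in order of discovery
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while order:                     # walk back from the farthest nodes
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both ends.
    return {v: b / 2 for v, b in score.items()}


# Hypothetical toy network: "hub" has more links, "bridge" guards two outliers.
toy = adjacency([("hub", "p"), ("hub", "q"), ("hub", "r"),
                 ("p", "q"), ("q", "r"), ("p", "r"),
                 ("hub", "bridge"), ("bridge", "x"), ("bridge", "y")])
scores = betweenness(toy)
degrees = {v: len(nbrs) for v, nbrs in toy.items()}
print(degrees["hub"], scores["hub"], degrees["bridge"], scores["bridge"])
```

Here `hub` has four links to `bridge`'s three, yet both sit on nine shortest paths, so they would be drawn the same size. That is the same flavour of result as Hamlet and Horatio above.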
This led to a lot of discussion in class as to the validity of the original Moretti pamphlet, much of which is core to the entire discussion of Digital Humanities. How valid is any conclusion from a network model such as this when the innate nature of the capture process (to provide the capta) may completely remove the ability to draw certain conclusions? The class discussed the impact of perceived and invisible observers, who strictly don’t have dialogue with each other but potentially have an impact upon other characters and their narrative interactions. (The Moretti pamphlet may be found here.) Moretti had a lot to say about Horatio as a proxy for the state and was very interested in the closeness of Hamlet to everyone else, but (later on) we ran some random network experiments and it turned out to be the type of connections in the network (the clustering coefficient) that was more interesting than the closeness.
We then moved on to a discussion of a number of useful metrics for networks, including the clustering effects that tend to indicate intention in the system we’re studying (there’s no real reason for a random network to form a small number of unique clusters unless you tune for it). We also discussed the Small Worlds of Watts and Strogatz (1998), where you have cliques of nodes (tightly connected clusters) linked together by a smaller number of links, characterised by a short average path length and a higher clustering coefficient (very basically).
We generated some random graphs to see if we got the structure we saw in Hamlet as noted earlier. Why? Because looking at Hamlet and drawing out information from the structure only has validity if (1) the model is accurate and (2) the situation couldn’t have easily arisen by chance. As noted, we generated a random graph for the same number of nodes and it had a similar average path length and identical diameter – but very different clustering coefficients! So it turns out that Hamlet wasn’t written by a monkey after all.
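The same kind of sanity check can be sketched outside Gephi. What follows is a minimal plain-Python version under my own assumptions: a G(n, m) random graph (n nodes, m links picked uniformly at random), the average local clustering coefficient, and path statistics by breadth-first search. The "observed" network is a made-up pair of four-node cliques, not the Hamlet data, and the function names are mine.

```python
import random
from collections import deque
from itertools import combinations


def random_graph(n, m, seed=None):
    """G(n, m): n nodes with m links chosen uniformly at random."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u, v in rng.sample(list(combinations(range(n), 2)), m):
        adj[u].add(v)
        adj[v].add(u)
    return adj


def average_clustering(adj):
    """Mean share of each node's neighbour pairs that are themselves linked."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours count as 0
        linked = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += linked / (k * (k - 1) / 2)
    return total / len(adj)


def path_stats(adj):
    """(average shortest path length, diameter) over reachable pairs.

    Assumes the graph has at least one link.
    """
    lengths = []
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        lengths.extend(d for node, d in dist.items() if node != s)
    return sum(lengths) / len(lengths), max(lengths)


# Hypothetical "observed" network: two 4-node cliques joined by one link.
observed = {v: set() for v in range(8)}
for group in ([0, 1, 2, 3], [4, 5, 6, 7]):
    for u, v in combinations(group, 2):
        observed[u].add(v)
        observed[v].add(u)
observed[3].add(4)
observed[4].add(3)

links = sum(len(nbrs) for nbrs in observed.values()) // 2
baseline = random_graph(len(observed), links, seed=1)
for name, graph in (("observed", observed), ("random", baseline)):
    print(name, round(average_clustering(graph), 3), path_stats(graph))
```

One random graph proves very little, so in practice you would generate many baselines and look at where the observed values fall in that spread.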
The final part of the session was on dynamic networks. This is the study of networks that change over time and we worked with some data that showed people’s association with an area over time. We could look at this in terms of change in time or in terms of progress through a corpus (chapters are as useful as dates here). What you want is a start date and an end date for the feautres in your network – when should the feature be there and when should it not be there anymore? It turns out that Gephi is quite convenient here, because you can merge a start and end time and end up with a time interval. Not bad, Gephi, not bad. Now we can see when things are active – great if you’re looking to see when students are active in forums or activities, for example. Here’s an example of the difference in the test network between the first and second halves of 2009, with all nodes of zero degree (nodes with no links) removed.
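The start-and-end interval idea is simple enough to sketch without Gephi. This is a minimal illustration under my own assumptions: each link carries a start and end date, a link counts as active if its interval overlaps the window, and nodes with no active links simply drop out. The names and dates are invented and have nothing to do with the class dataset.

```python
from datetime import date


def active_edges(edges, window_start, window_end):
    """Links whose [start, end] interval overlaps the given window."""
    return [(u, v) for u, v, start, end in edges
            if start <= window_end and end >= window_start]


def active_nodes(edge_pairs):
    """Nodes with at least one active link (zero-degree nodes vanish)."""
    return {node for pair in edge_pairs for node in pair}


# Hypothetical links: (node, node, first seen, last seen).
timed_edges = [
    ("ann", "bob", date(2009, 1, 10), date(2009, 3, 1)),
    ("bob", "cy", date(2009, 5, 1), date(2009, 9, 30)),
    ("cy", "dee", date(2009, 8, 1), date(2009, 12, 1)),
]

first_half = active_edges(timed_edges, date(2009, 1, 1), date(2009, 6, 30))
second_half = active_edges(timed_edges, date(2009, 7, 1), date(2009, 12, 31))
print(sorted(active_nodes(first_half)))
print(sorted(active_nodes(second_half)))
```

Chapter numbers would work just as well as dates here, since all the comparison needs is something that can be ordered.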
We then played around a lot with a wide variety of animations over time, including ongoing calculations, colour and shape alterations. Basically, it was rather hallucinatory by the end but that may be the Dr Pepper talking. We even got to see the sparklines (Zelchenko/Tufte) for some extra visualisation goodness!
This is one of those classic separations between the things we visualise for ourselves to help us to work out what’s interesting about a dataset and what we would visualise for presentation, especially to non-discipline people. There’s a really big difference between what scientists want to see and what other people want to see – and if we don’t realise that, then we risk either having presentations that don’t communicate enough information or we kill people with minutiae. Look at this – how useful is it for you?
Another good day but I think there are some very full brains!
Humanities Intensive Learning + Teaching, Day 3, Maryland Institute for Technology in the Humanities, #hilt2014
Posted: August 8, 2014
Today was going to be a short day because we had the afternoon off to go and do cultural things. (I took the afternoon to write papers and catch up on work. I tend to work in both timezones when travelling because otherwise work will eat my head.) Today we explored a lot of filtering in Gephi, which was … interesting and best done in practice. Which we did. So, hooray!
We looked at Multimodal Network Projection throughout the day but I’ll come back to that. We started looking at other ways of determining the important and dependent nodes in a network, starting with the HITS algorithm, which identifies Hubs and Authorities in your network.
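For the curious, here is a minimal sketch of the HITS idea in plain Python, assuming a directed network given as (source, target) pairs. A good hub points at good authorities and a good authority is pointed at by good hubs, so the two scores are updated in turn until they settle. The function name `hits`, the fixed iteration count and the toy links are assumptions for illustration, not Gephi’s implementation.

```python
import math


def normalise(scores):
    """Scale scores so that their squares sum to 1."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {n: v / norm for n, v in scores.items()}


def hits(edges, iterations=50):
    """Return (hub, authority) scores for a directed edge list."""
    nodes = {n for edge in edges for n in edge}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # authority score: sum of the hub scores of everything pointing at you
        auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            auth[dst] += hub[src]
        auth = normalise(auth)
        # hub score: sum of the authority scores of everything you point at
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += auth[dst]
        hub = normalise(new_hub)
    return hub, auth


# Hypothetical toy network: a and b both point at c, a also points at d
hub, auth = hits([("a", "c"), ("b", "c"), ("a", "d")])
print("best hub:", max(hub, key=hub.get))
print("best authority:", max(auth, key=auth.get))
```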
Then we moved into the wonderful world of PageRank, Citation networks and how all of these things work. PageRank is, fundamentally, how Google works out which pages to give you. You won’t be able to find out the details of the way that Google runs PageRank because gaming PageRank to serve up pages that you probably didn’t want to see is very big business and highly irritating. Search Engine Optimisers (SEOs) spend a lot of time trying to reverse engineer the algorithm and Google spends a lot of time tweaking it. It’s good we’ve solved all the important problems in the world so we can spend time on this.
Back to PageRank. PageRank looks at the number of links going to a node and what the quality of these links is in order to work out which the most important nodes (pages in the case of Google search) are and, hence, which ones you want. In Digital Humanities, you can use this to identify the most significant element of your model – which, by the way, quickly establishes that Hamlet is not as important as you think. Take that, Hamlet! Want more detail on PageRank? Look here.
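As a rough illustration of the “number and quality of links” idea, here is the textbook power-iteration form of PageRank in plain Python. This is emphatically not Google’s production algorithm (see above for why you won’t find that). The damping value of 0.85 is the commonly quoted default, and the function name and toy link data are assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook PageRank by power iteration.

    links maps each node to the list of nodes it links to.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # rank held by nodes with no outgoing links is shared out evenly
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        # every node passes its rank on in equal shares to the nodes it links to
        for src, targets in links.items():
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank


# Hypothetical toy network: a, b and d all link to c, and c links back to a
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]})
for node in sorted(ranks, key=ranks.get, reverse=True):
    print(node, round(ranks[node], 3))
```

Note that `a` ends up ranked well above `b` and `d` even though nothing but `c` links to it: one link from an important node is worth more than none at all, which is the “quality” half of the idea.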
In Citations, we want to see who is citing which reference, which is straightforward. In Co-Citation networks, we want to measure how often two documents are cited together. There are many reasons for looking at this, but it helps to detect things like cronyism (someone citing a paper because they like someone rather than because the information is useful). As we discussed before, the Matthew Effect comes in quickly, where frequently cited papers get even more frequently cited because they must be good because they’re cited so frequently. (Tadahhh.)
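Counting co-citations is simple enough to sketch in a few lines of Python. The assumption here is that you already have, for each citing paper, the set of references it cites. The paper and reference names below are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citations):
    """Count how often each pair of references is cited by the same paper.

    citations maps a citing paper to the collection of references it cites.
    """
    counts = Counter()
    for refs in citations.values():
        # sort so that each pair always appears in the same order
        for pair in combinations(sorted(set(refs)), 2):
            counts[pair] += 1
    return counts


# Hypothetical citing papers p1..p3 and references x, y, z
citations = {
    "p1": {"x", "y"},
    "p2": {"x", "y", "z"},
    "p3": {"y", "z"},
}
counts = cocitation_counts(citations)
print(counts.most_common())
```

Each pair count then becomes the weight of a link between two references in the co-citation network.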
We also looked at a rather complicated area of multimodal projection, which is going to need some set-up. If you have a set of authors and a set of publications, then you can associate authors with publications and vice versa. However, this means that the only way for two authors to be connected is by sharing a publication and similarly for two publications sharing an author. This is a bipartite network and is very common in this kind of modelling. Now, if we make it more complicated, by moving to a conference and having Authors, Papers and Presentation Sessions, we now have a tripartite network and this becomes very hard to visualise.
What we can do is clean up this network to make it easier to represent by hiding some of the complexity in the connections between nodes. Let’s say we want to look at Authors and Presentation Sessions. Then, while the real network is Authors connected to Papers connected to Presentation Sessions, we can hide the Papers with a network link that effectively says “connects this author via a presentation to this session” and suddenly our network looks like it’s only Authors and Sessions. This level of visual de-cluttering, which is dimensional reduction for those playing along at home, makes it easier for us to visually represent the key information and produce network statistics on these simpler graphs. It’s also a natural fit for triple-based representations like the Resource Description Framework (RDF) because the links in the network now map straight to predicates. (Don’t worry if you didn’t get that last bit, some people just got very excited.)
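Here is a minimal sketch of that hiding step, assuming the tripartite network is stored as two lists of pairs: Author to Paper and Paper to Session. The function name and the sample conference are made up for illustration. The weight on each projected link records how many papers connect that author to that session.

```python
from collections import Counter, defaultdict


def project_authors_to_sessions(author_paper, paper_session):
    """Collapse Author-Paper-Session paths into weighted Author-Session links."""
    sessions_of = defaultdict(set)
    for paper, session in paper_session:
        sessions_of[paper].add(session)
    weights = Counter()
    for author, paper in author_paper:
        # follow the hidden paper node through to its session(s)
        for session in sessions_of.get(paper, ()):
            weights[(author, session)] += 1
    return weights


# Hypothetical conference data
author_paper = [
    ("ann", "paper1"),
    ("bob", "paper1"),
    ("ann", "paper2"),
    ("cat", "paper3"),
]
paper_session = [
    ("paper1", "session A"),
    ("paper2", "session A"),
    ("paper3", "session B"),
]

projected = project_authors_to_sessions(author_paper, paper_session)
for (author, session), weight in sorted(projected.items()):
    print(author, "->", session, "weight", weight)
```

The Papers never appear in the output, but they are not quite lost: `ann` reaches session A through two papers, so that link carries a weight of 2.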
Finally, we looked at how we collect information. Is it stuff we just pick up from the environment (data) or is it something that we choose the way that we collect it (capta)? (Capta comes from the word for capture. Data is passive. Capta is active. Take that, Bembridge Scholars!) If you think about it, every time you put your data into a spreadsheet, you are imposing a structure upon it, even down to which column is which – it’s not technically data, it’s capta because your interpretation alters it before it even reaches the analysis stage. When it comes to the network that you draw, do you care about the Proximities of elements in your network (location, membership or attitude), the Relations in your network (role, affective, perceptual), the Interactions or the Flows? All of these are going to change what the nodes and edges (links) represent in the network.
The simple rule is that entities are connected by relationships. (For those who think in tuples, think “subject, predicate, object” and get your predicate on!) However you do it, you have to pick what’s important in your data, find it, capture it, analyse it and present it in a way that either shows you cool things or supports the cool things that you already know.
A lot to cover today!
After the session, I did some work and then headed off for dinner and board games with some of the other people from the workshop. A nice relaxing night after a rather intense three days.