Good riddance to 2020

Christmas is very nearly here, and a very welcome thing it is, too. After a streak of mild and rainy days our snow is largely gone, and frankly it’s depressingly dark right now, so a bit of Christmas cheer is just the thing to wash away the dust and grime of this mess of a year. The December solstice was yesterday, so technically the days are growing longer already, but of course it’s going to take a good while before that actually becomes noticeable.

Things seem to be looking up on the COVID front as well, with new cases on the decline in Oulu and the start of vaccinations just around the corner. I’ve been voluntarily living under lockdown-like conditions for a few weeks now: no band rehearsals, no coworker lunches (except on Teams), no pints in pubs, only going out for exercise and shopping and keeping the latter to a minimum. I hope this is enough for me to spend Christmas with my parents relatively safely; it’s going to be a very small gathering, but at least I won’t have to eat my homemade Christmas pudding all by myself, which might just be the death of me. 

This blog post will be the last work thing I do before I sign off for the year. I was going to do that yesterday, but decided to take care of a couple more teaching-related tasks today in order to have a slightly cleaner slate to start with when I return to work. There will still be plenty of carry-over from 2020 to keep me busy in January 2021; most urgently, there’s a funding application to finish and submit once we get the consortium negotiations wrapped up, as well as an article manuscript to revise and submit. I got the rejection notification a couple of weeks ago, but haven’t had the energy to do much about it apart from talking to my co-author about what our next target should be. 

Improving the manuscript is a bit of a problem, because the biggest thing to improve would be the evaluation, but the KDD-CHASER project is well and truly over now and I’ve moved on to other things, so running another live experiment is not a feasible option. We will therefore just have to make do with the results we have and try to bolster the paper in other areas, maybe also change its angle and/or scope somewhat. I should at least be able to beef up the discussion of the data management and knowledge representation aspect of the system, although I haven’t made much tangible progress on the underlying ontology since leaving Dublin. 

I have been working on a new domain ontology, though, in the project that’s paying most of my salary at the moment. Ontologies are fun! There’s something deeply satisfying about designing the most elegant set of axioms you can come up with to describe the particular bit of the universe you’re looking at, and about the way new incontrovertible facts emerge when you feed those axioms into a reasoner. I enjoy the challenge of expressing as much logic as I can in OWL instead of, say, Python, and there’s still plenty of stuff for me to learn; I haven’t even touched SPARQL yet, for instance. Granted, I haven’t found a use case for it either, but I have indicated that I would be willing to design a new study course on ontologies and the semantic web, so I may soon have an excuse…
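To give a flavour of that axioms-plus-reasoner satisfaction, here’s a minimal toy sketch using the owlready2 Python library. To be clear, the classes and individuals below are entirely made up for illustration and have nothing to do with the actual domain ontology I’m working on:

```python
from owlready2 import get_ontology, Thing, ObjectProperty, sync_reasoner

# A made-up toy ontology, purely for illustration.
onto = get_ontology("http://example.org/toy.owl")

with onto:
    class Person(Thing): pass
    class Instrument(Thing): pass
    class Fiddle(Instrument): pass

    class plays(ObjectProperty):
        domain = [Person]
        range  = [Instrument]

    # A defined (equivalent) class: a Fiddler is exactly a Person
    # who plays some Fiddle.
    class Fiddler(Person):
        equivalent_to = [Person & plays.some(Fiddle)]

# Assert only the raw facts...
my_fiddle = Fiddle("my_fiddle")
me = Person("me", plays=[my_fiddle])

# ...and let the reasoner (HermiT, bundled with owlready2 but
# requiring Java) derive the rest.
sync_reasoner()
print(me.is_a)  # Fiddler should now appear among the inferred classes
```

The point is the last line: nobody ever asserted that me is a Fiddler – the reasoner derives it from the equivalence axiom, and that’s exactly the kind of incontrovertible new fact I mean.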

Another thing to be happy about is my new employment contract, which is a good deal longer than the ones I’m used to, although still for a fixed term. On the flip side, I guess this makes me less free to execute sudden career moves, but I’d say that’s more of a theoretical problem than a practical one, given that I’m not a big fan of drastic changes in my life and anyway these things tend to be negotiable. In any case, it’s a nice change to be able to make plans that extend beyond the end of next year! 

Well, that’s all for 2020 then. Stay safe and have a happy holiday period – hope we’ll start to see a glimmer of normality again in 2021. 

Summing up the AI summit

The end of the year is approaching fast, with Christmas now barely two weeks away, but I managed to fit in one more virtual event to top off this year of virtual events: the Tortoise Global AI Summit. To be quite honest, I wasn’t actually planning to attend – didn’t even know it was happening – but a colleague messaged me the previous day, suggesting that it might be relevant to my interests and also that the top brass would appreciate some kind of executive summary for the benefit of the Faculty. Despite the short notice I had most of the day free from other engagements, and since the agenda did indeed look interesting, I decided to register and check it out – hope this blog post is close enough to what the Dean had in mind! 

I liked the format of the event, a series of panel discussions rather than a series of presentations. Even the opening keynote with Oxford’s Sir Nigel Shadbolt was organised as a one-on-one chat between Sir Nigel and Tortoise’s James Harding, which felt more natural in an online environment than the traditional “one person speaks, everyone else listens, Q&A afterward” style. Something that worked particularly well was the parallel discussion on the chat, to which anyone attending the event could contribute and from which the moderators would from time to time pick questions or comments to be discussed with the main speakers. Overall, I was left with the feeling that this is the way forward with virtual events: design the format around the strengths of online instead of trying to replicate the format of an offline event using tools that are not (yet) all that great for such a purpose. 

The keynote set the tone for the rest of the event, bringing up a number of themes that would be discussed further in the upcoming sessions: the hype around AI versus the reality, transparency of AI algorithms and AI-based decision making, AI education – fostering AI talent in potential future professionals and data/algorithm literacy in the general populace – and the need for data architectures designed to respect the ethical rights of data subjects. Unhealthy power concentrations and how to avoid them was a topic that resonated with the audience, and it shouldn’t be too hard to think of a few examples of such concentrations. The carbon footprint of running AI software was brought up on the chat. Perhaps my favourite bit of the session was Sir Nigel’s point that there is a need for institutional and regulatory innovations, which he illustrated by citing the limited company as a historical example of an institutional innovation. Such innovations are perhaps more easily overlooked than scientific and technological ones, but one can hardly deny that they, too, have changed the world and will continue to do so.

The world according to Tortoise

The second session was about the new edition of the Tortoise Global AI Index, which ranks 62 countries of the world on their strength in AI capacity, defined as comprising the three pillars of implementation, innovation and investment. These are further divided into the seven sub-pillars of talent, infrastructure, operating environment, research, development, government strategy and commercial, and the overall score of each country is based on a total of 143 individual indicators. The scores are normalised such that the top country gets an overall score of 100, and it’s no big surprise that said country is the United States, as it was last year when the index was launched. China and the United Kingdom similarly retain their places as no. 2 and no. 3, respectively. China has closed some of the gap with the US but is still quite far behind with a score of 62, while the UK, sitting at around 40, has lost some of its edge over the challengers. Canada, Israel, Germany, the Netherlands, South Korea, France and Singapore complete the top 10. 
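As an aside, the normalisation itself is simple enough to sketch in a few lines of Python. The raw scores here are invented numbers, chosen only to reproduce the relative gaps quoted above; the real index aggregates 143 weighted indicators per country:

```python
# Hypothetical aggregate scores, invented to mirror the gaps described
# in the text – not Tortoise's actual figures.
raw = {"United States": 8.7, "China": 5.4, "United Kingdom": 3.5}

# Normalise so that the top country scores exactly 100.
top = max(raw.values())
normalised = {country: round(100 * score / top) for country, score in raw.items()}
print(normalised)  # {'United States': 100, 'China': 62, 'United Kingdom': 40}
```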

Finland is just out of the top 10 but rising, up three places from 14th to 11th. According to the index, Finland’s particular forte is government strategy, comprising indicators such as the existence of a national AI strategy signed by a senior member of government and the amount of dedicated spending aimed at building AI capacity. In this particular category Finland is ranked 5th in the world. Research (9th) and operating environment (11th) can also be counted among Finland’s strengths, and all of its other subrankings (talent – 16th, commercial – 19th, infrastructure – 21st, development – 22nd) are solidly above the median as well. Interestingly, the US, while being ranked 1st in four categories and in the top 10 for all but one, is only 44th on operating environment. The most heavily weighted indicator here is the level of data protection legislation, giving countries covered by the GDPR a bit of an edge; 7 of the top 10 in this category are indeed EU countries, but there is also, for instance, China in 6th place, so commitment to privacy is clearly not the whole story. 

There was some good discussion on the methodology of the AI index, such as the selection of indicators. For example, one could question the rather heavy bias toward LinkedIn as a source of indicators for AI talent. Another interesting point raised was that while we tend to consider academics mainly in terms of their affiliation, it might also be instructive to look at their nationality. Indeed, the hows and whys of the compilation of the index would easily make for a dedicated blog post, or even a series of posts, but I’ll leave it for others to produce a proper critique. For those who are interested, a methodology report is available online. 

From the Global AI Index the conversation transitioned smoothly into the next session on the geopolitics of AI, where one of the themes discussed was whether countries should be viewed as competing against one another in AI, or whether AI should rather be seen as an area of international collaboration for the benefit of citizens everywhere. Is there an AI race, like there once was a space race? Is mastery of AI a strategic consideration? Benedict Evans advocated the position that to talk about AI strategy is to adopt the wrong level of abstraction, and that AI (or rather machine learning) is just a particular way of creating software that in about ten years’ time will be like relational databases are today: so ubiquitous and mundane that we hardly pay any attention to it. This was in stark contrast to the view put forward at the beginning of the session that AI is a general-purpose technology akin to electricity, with comparable potential to revolutionise society. The session was largely dominated by this dialectic, but there was still time for other themes as well, such as the nature of AI clusters in a world where geographically limited technology clusters are becoming an outdated concept, and the role of so-called digital plumbing in providing the essential foundation for the success of today’s corporate AI power players.

Winners and losers

The next session, titled “AI’s ugly underbelly”, started by taking a look at an oft-forgotten part of the AI workforce, the people who label data so that it can be used to train machine learning models. It’s been estimated that data labelling accounts for 25% of the total time in an ML project, but the labellers are, from the perspective of the company running the project, an anonymous mass employed through crowdsourcing platforms such as MTurk. In academic research the labellers are often found closer to home – the job is likely to be done by your students and/or yourself, and when crowdsourcing is used, people may well be willing to volunteer for the sake of contributing to science, such as in the case of the Zooniverse projects. In business it’s a different story, and there is some money to be made by labelling data for companies, but not a lot; it’s an unskilled job that obeys the logic of the gig economy, where the individual worker must buy their own equipment and has very little in the way of job security or career prospects.

The subtitle of this session was “winners and losers of the workforce”, the winners of course being the highly skilled professionals who are in increasingly high demand and therefore increasingly highly paid. There was a good deal of discussion on the gender imbalance among such people, reflecting a similar imbalance in the distribution of the sort of hard (STEM) skills usually associated with tech jobs. In labelling the gap is apparently much narrower, in some countries even nonexistent. It was argued that relevant soft skills and potential AI talent are distributed considerably more evenly, and that companies trying to find people for AI-related roles may want to look beyond the traditional recruiting pathways for such roles. A minor point that I found thought-provoking was that recruiting is one of the application domains of AI, so the AI of today is involved in selecting the people who will build the AI of tomorrow – and we know, of course, that AI can be biased. One of the speakers brought up the analogy that training an AI is like training a dog in that the training may appear to be a success, but you cannot be sure of what it is that you’ve actually trained it to respond to. 

There was more talk about AI bias in the “AI you can trust” session, starting with what we mean by the term in the first place. We can all surely agree that AI should be fair, but can we agree on what kind of fairness we want – does it involve positive discrimination, for example? Bias in datasets is a relatively straightforward concept, but beyond that things get less tidy and more ambiguous. There is also the question of how we can trust that an AI is not biased, provided that we can agree on the definition; a suggested solution is to have algorithms audited by a third party, which could provide a way to strike a balance between the right of individuals to know what kind of decision-making processes they are being subjected to and the right of organisations to keep their algorithms confidential. An idea that I found particularly interesting, put forth by Carissa Véliz of the Institute for Ethics in AI, was that algorithms should be made to undergo a randomised controlled trial before they are allowed to make decisions that have a serious, potentially even ruinous, effect on people’s lives. 

Data protection was, of course, another big topic in this session. That personal data should be handled responsibly is again something we can all agree on, but there was a good deal of debate on the proper way to regulate companies to ensure that they are willing and able to shoulder that responsibility. Should they be told how to behave in a top-down manner, or is it better to adopt a bottom-up strategy and empower individuals to look after their own interests when it comes to privacy? Is self-regulation an option? The data subject rights guaranteed by the GDPR represent the bottom-up approach and are, in my opinion, a major step in the right direction, but it’s also a matter of having effective means to enforce those rights, and here, I feel, there is still a lot of work to be done. The GDPR, of course, only covers the countries of the EU and the EEA, and it was suggested that perhaps we need an international organisation for the harmonisation of data protection, a “UN of data” – a tall order for sure, but one worth considering.

Grand finale

The final session, titled “AI: the breakthroughs that will shape your life”, included several callbacks to themes discussed in previous sessions, such as the growth of the carbon footprint of AI as the computational cost of new breakthroughs continues to increase – doubling almost every 3 months according to an OpenAI statistic. The summit took place just days after the announcement of a great advance achieved by DeepMind’s AlphaFold AI in solving the protein folding problem in computational biochemistry, already mentioned at the beginning of the first session and discussed further here. While it was pointed out that the DeepMind solution is not necessarily the be-all and end-all it has been hailed as, it certainly serves to demonstrate that the technology is good for tackling serious scientific problems and not just for mastering board games. The subject of crowdsourcing came up again in this context, as the approach has been applied to the folding problem with some success in the form of Folding@home, where the home computers of volunteers are used to run distributed computations, as well as Foldit, a puzzle video game that essentially harnesses the volunteers’ brains to do the computations.
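Returning to that doubling statistic for a moment: assuming a clean exponential (which reality is under no obligation to follow), the quoted figure implies growth at a genuinely alarming pace. A quick back-of-the-envelope sketch:

```python
# Back-of-the-envelope growth in the compute behind state-of-the-art AI,
# assuming it keeps doubling every ~3.4 months (the figure from OpenAI's
# "AI and Compute" analysis, loosely quoted above as "almost every 3 months").
doubling_months = 3.4

for years in (1, 2, 5):
    factor = 2 ** (12 * years / doubling_months)
    print(f"after {years} year(s): ~{factor:,.0f}x the compute")
# prints roughly 12x, 133x and 205,000x respectively
```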

There was some debate on the place of humans in a society increasingly permeated by AI systems, particularly on where we want to draw the line on AI autonomy and whether new jobs created by AI will be enough to compensate for old ones replaced by AI. Somewhat ironically, data labeller is a job created by AI that may already be on its way to being made obsolete by advances in AI techniques that do not require large quantities of labelled data for training. One of the speakers, Connecterra founder Yasir Khokhar, talked about the role of AI in solving the problem of feeding the world, reminding me of Risto Miikkulainen’s keynote talk at CEC 2019, in which he presented agriculture as one of the application domains of creative AI through evolutionary computation. OpenAI’s GPT-3 was then brought up as another example of a recent breakthrough, leading to a discussion on how we tend to anthropomorphise our Siris and Alexas and to ascribe human thought processes to entities that merely exhibit some semblance of them. There was a callback to AI ethics here when someone asked whether we have the right to know when we are interacting with an AI – if we’re concerned about AI transparency, then arguably being aware that there is an AI is the most basic level of it. Of things that are still in the future, the impact of quantum computing on AI was discussed, as were the age-old themes of artificial general intelligence and rogue AI as existential risk, but in the time available it wasn’t feasible to come to any real conclusions. 

Inevitably, it got harder to stay alert and focused as the afternoon wore on, and I also missed the beginning of one session because I had to attend another (albeit very brief) meeting, but even so, I managed to gather a good amount of interesting ideas and information over the course of the day. I’m particularly happy that I got a lot of material on the social implications of AI that we should be able to use when developing our upcoming AI ethics course; until now I haven’t had a clear picture of the specific topics in this area that we could discuss in the lectures. Not a moment too soon, I might add – we’re due to start teaching that course in March, so it’s time to get cracking on the preparations!

Heart of darkness

The news came in yesterday that the university is extending its current policy of remote work and teaching, previously effective until the end of 2020, to the end of May 2021. Not a huge shock, frankly; it’s what my money would have been on, and I wrote as much yesterday when I was drafting this post, before the announcement came. It doesn’t really change any plans either, since we’ve been assuming from the get-go that our AI ethics course, due to be lectured in the second period of the spring term, will be taught remotely. Still, it’s strange to think that by the end of this latest extension, we’ll have been working from home for more than a year without interruption – and of course there’s no guarantee that things will be back to normal even then, although one may hope that at least some of us will have been vaccinated already. In the meantime, I’ll be getting my flu shot for the coming winter, courtesy of occupational healthcare.

Speaking of winter, it’s almost November, and as the days grow shorter, I’m reminded of the one redeeming feature of the dreary Irish winter in comparison with the Finnish one: more daylight. Last year and the year before, I “cheated” and only came to Finland for the end-of-year holidays, not long enough to really feel the effects of prolonged darkness – especially since I wasn’t working during the time I spent here and therefore could sleep for as long as I wished. Now, however, I’ve already noticed that it’s getting more laborious to get myself up and running in the morning, and while the turning of the clocks on Sunday brought some temporary relief by making mornings somewhat brighter, it’s not going to last long.

Fortunately, working from home has rendered the concept of office hours even less relevant than it was before the pandemic. I was free to choose my own hours before, but there was still a fairly strong preference to be at the office at more or less the same times as my colleagues, for the social aspect if not for anything else. Now that there’s basically nothing to be gained from being together at the “office” (i.e. at our computers in our respective homes), I’ve taken to sleeping according to what I presume is my natural rhythm, which I suppose cannot be a bad thing healthwise. There are still the meetings, of course, but I’ve mostly managed to avoid having them so early in the morning that I couldn’t trust myself to wake up for them without setting an alarm, although I’m not sure how that’s going to work out when we get to winter proper and there’s barely any daylight at all.

Before the all-staff email yesterday, I was already thinking that if we do go back to working on campus after New Year, I may well continue to take remote days more frequently than I used to, at least during the winter and especially when it’s very cold. As much as I love a good northern winter with lots of snow, I don’t particularly relish temperatures closer to minus twenty than minus ten, and when you combine that with pitch darkness in the morning, the thought of staying in bed is very tempting. So, once in a while, why not just do that, get up when you actually feel up for it and work from home, since that’s now officially sanctioned by university policy? 

I participated in my very first virtual conference last week, the one-day Conference on Technology Ethics (formerly Seminar on Technology Ethics) organised by the Future Ethics research group at the University of Turku. I didn’t present anything, but the event was free of charge and I figured I might come away with some fresh ideas for the AI ethics course and perhaps even for my research. The conference did not disappoint – particularly the keynote talks by Maija-Riitta Ollila and Bernd Carsten Stahl were very much the sort of thing I was hoping for, and I think I’ll be referring back to them when I get to the work of creating my lecture materials. Everything went reasonably smoothly too, although there were some technical issues with screen sharing on Zoom. There was even a virtual conference dinner in the evening, but I didn’t participate so I don’t know how that worked out in practice. 

The next online event I’m looking forward to is a cultural one: the Virtual Irish Festival of Oulu! As the organisers put it, it’s the first, and optimistically also the last, of its kind: under normal circumstances the festival would have been at the beginning of October and very much non-virtual, taking place in various venues around town and offering music, dance, theatre, cinema, storytelling and workshops over a period of five days. I’m rather annoyed that there’s no proper live festival this year, since I missed the last two – this may seem like a silly thing to complain about, considering the reason I missed them is that I was in actual Ireland, but it’s not like they have trad festivals there all the time. Still, a virtual festival is surely better than no festival at all, and the programme looks very promising, so I’ll definitely be tuning in, and I think I’ll buy the €5 optional virtual ticket as well, to support the cause.

You’ve changed, man

I’ve been back at work after my summer vacation for about a month now, so I guess it’s about time I got back into blogging as well. Not that there’s a whole lot of news – I’m still doing the vast majority of my work in my living room and only visiting the campus sporadically. Frankly, I would have expected things to be closer to normal by now, but perhaps we first need to figure out what is normal anyway (hat tip to The Hitchhiker’s Guide to the Galaxy). The university’s playing it safe and recommending not just working remotely but also wearing a mask now, if you’re going to come to the campus and do anything other than sit in your office. My closest colleagues and I are doing our best to keep the social group tight: constant WhatsApp chatter, weekly lunches and virtual coffee mornings, the occasional face-to-face meeting. 

Naturally, working remotely means that we’ll also be teaching remotely, which affects me since we’re running our Towards Data Mining course in period 1. While I was in Ireland, my lecture – a hodgepodge of ethics, data security and data management topics – was handled by a colleague, but when I came back this year I took over from her again. The aforementioned colleague also recorded my part of the series of lecture videos used in lieu of live lectures when we ran the course in the spring term, so I was there basically just to mark exercise reports and exam answers. In the autumn term we were planning to lecture the course the traditional way, but now that that’s not an option, we’re going to present the lectures on Zoom instead.

I’ve said before that I’m not overly keen on lecturing, and I’m not at all sure if doing it online will make things better or worse. On the one hand, I suppose it should be easier to stay relaxed when I can do the lecture from the comfort of my home, but on the other hand, I think it may feel somewhat unnatural to be addressing an audience while essentially talking to myself, unable to gauge if the students are paying any attention to what I’m saying. Online meetings I’ve grown used to, but those are much more interactive and therefore not really the same thing. It doesn’t exactly help that I haven’t given that lecture in three years, so that would add to my nervousness even if nothing had changed in the meantime. 

The new AI ethics course has taken a step forward: a formal proposal for a pilot run next spring has been prepared and submitted to the Faculty. With the two courses plus a bunch of Master’s theses to supervise, I feel like my job has recently been more about teaching than about research. Not that I mind, really – it’s all meaningful work, and all part of why I’ve always held universities in very high esteem as places devoted to the creation, curation and distribution of the best of human knowledge. Obviously teaching and research require substantially different skill sets and therefore being good at one does not imply being good at the other, but that doesn’t mean it’s a good idea to treat these core functions of a university as if they were two completely separate domains rather than two sides of the same coin.

When I started this blog, I said its theme would be knowledge, and I seem to have circled back to that even though I wasn’t really planning to. I’m a firm believer in the intrinsic value of knowledge, and passing on the knowledge you have is an essential part of maximising that value, just as important as creating new knowledge. On a more personal and subjective level, I’ve always found great joy in learning or figuring out things I didn’t know before, and if I can help others feel that same joy, so much the better. I still doubt that I’d be very happy in an all-teaching role, but I’ve come to view teaching as a natural part of the job, something I can find satisfaction in and also something I can make a steady contribution in while research has its ups and downs. It’s not that many years ago that I saw teaching mainly as a nuisance to be avoided, so I guess it’s fair to say I’ve changed!