The NGI Policy Summit was a great opportunity for policymakers, innovators and researchers to come together to start laying out a European vision for the future internet and elaborate the policy interventions and technical solutions that can help get us there.
As part of the Summit, Nesta Italia and Impactscool hosted a futures workshop exploring the key design choices for the future internet. It was a participative and thought-provoking session. Here we take a look at how it went.
The discussion about the internet of the future is complex and touches on many challenges our societies face today: data sovereignty, safety, privacy, sustainability and fairness, to name just a few, as well as the implications of new technologies such as AI and blockchain, and areas of concern around them, such as ethics and accessibility.
In order to define and build the next generation internet, we need to make a series of design choices guided by the European values we want our internet to radiate. However, moving from principles to implementation is hard: all of these areas interact, and design choices force us to make trade-offs between them.
Our workshop’s goal was to bring to life some of the difficult decisions and trade-offs we need to consider when we design the internet of the future, in order to help us reflect on the implications and interaction of the choices we make today.
How we did it
The workshop was an immersive simulation about the future in which we asked the participants to make some key choices about the design of the future internet, then dived deep into possible future scenarios emerging from these choices.
The idea is that it is impossible to know exactly what the future holds, but we can explore different models and be open to many different possibilities, which can help us navigate the future and make more responsible and robust choices today.
In practice, we presented the participants with the following four challenges in the form of binary dilemmas and asked them to vote for their preferred choice in a poll:
Data privacy: protection of personal data vs data sharing for the greater good
Algorithms: efficiency vs ethics
Systems: centralisation vs decentralisation
Information: content moderation vs absolute freedom
For each of the 16 combinations of binary choices we prepared a short description of a possible future scenario, which considered the interactions between the four design areas and aimed at encouraging reflection and discussion.
Based on the majority votes we then presented the corresponding future scenario and discussed it with the participants, highlighting the interactions between the choices and exploring how things might have panned out had we chosen a different path.
Data privacy: protection of personal data (84%) vs data sharing for the greater good (16%)
Information: content moderation (41%) vs absolute freedom (59%)
The results above summarise the choices made by the participants during the workshop, which led to the following scenario.
Decentralized and distributed points of access to the internet make it easier for individuals to manage their data and the information they are willing to share online.
Everything that is shared is protected and can be used only following strict ethical principles. People can communicate without relying on big companies that collect data for profit. Information is totally free and everyone can share anything online with no filters.
Not so one-sided
Interesting perspectives emerged when we asked for contrarian opinions on the more one-sided questions, demonstrating that middle-ground, context-aware solutions are required in most cases when dealing with topics as complex as those analysed.
We discussed how certain non-privacy-sensitive data can genuinely contribute to the benefit of society, with minimal concern for the individual if shared in anonymised form. Two examples that emerged from the discussion were transport management and research. On the (de)centralisation debate, we explored how decentralisation could result in a diffusion of responsibility and a lack of accountability: "If everyone's responsible, nobody is responsible." We mentioned how this risk could be mitigated through tools like public-private-people collaboration and data cooperatives, combined with clear institutional responsibility.
How collective intelligence can help tackle major challenges…
...and build a better internet along the way!
by Aleks Berditchevskaia, Markus Droemann
It’s hard to imagine what our social response to a public health challenge at the scale of COVID-19 would have looked like just ten or fifteen years ago – in a world without sophisticated tools for remote working, diversified digital economies, and social networking opportunities.
The common enabler of all these activities is the internet. Recent years have seen innovation across all of its layers – from infrastructure to data rights – resulting in an unprecedented capacity for people to work together, share skills and pool information to understand how the world around them is changing and respond to challenges. This enhanced capacity is known as collective intelligence (CI).
The internet certainly needs fixing – from the polarising effect of social media on political discourse to the internet’s perpetual concentration of wealth and power and its poorly understood impact on the environment. But turning to the future, it’s equally clear that there is great promise in the ability of emerging technologies, new governance models and infrastructure protocols to enable entirely new forms of collective intelligence that can help us solve complex problems and change our lives for the better.
Based on examples from Nesta's recent report, The Future of Minds & Machines, this blog shows how an internet based on five core values can serve to combine distributed human and machine intelligence in new ways and help Europe become more than the sum of its parts.
Resilience is a core value for the future internet. It means secure infrastructure and the right balance between centralisation and decentralisation. But it also means that connected technologies should enable us to better respond to external challenges. Online community networks that can be tapped into and mobilised quickly are already an important part of the 21st century humanitarian response.
Both Amnesty International and Humanitarian OpenStreetMap have global communities of volunteers, numbering in the thousands, who participate in distributed micromapping efforts to trace features like buildings and roads on satellite images. These online microtasking platforms help charities and aid agencies understand how conflicts and environmental disasters affect different regions around the world, enabling them to make more informed decisions about the distribution of resources and support.
More recently, these platforms have started to incorporate elements of artificial intelligence to support the efforts of volunteers. One such initiative, MapWithAI, helps digital humanitarians to prioritise where to apply their skills to make mapping more efficient overall.
The internet also enables and sustains distinct communities of practice, like these groups of humanitarian volunteers, allowing individuals with similar interests to find each other. This social and digital infrastructure may prove invaluable in times of crises, when there is a need to tap into a diversity of skills and ideas to meet unexpected challenges.
One example of collective intelligence improving inclusiveness – while also taking an inclusive-by-design approach – is Mozilla’s Common Voice project, which uses an accessible online platform to crowdsource the world’s largest open dataset of diverse voice recordings, spanning different languages, demographic backgrounds and accents.
Ensuring diversity of contributions is not easy. It requires a deliberate effort to involve individuals with rare knowledge, such as members of indigenous cultures or speakers of unusual dialects. But a future internet built around an inclusive innovation ecosystem, products that are inclusive-by-design, and fundamental rights for the individual – rather than a closed system built around surveillance and exploitation – will make it easier for projects like Common Voice to become the norm.
The future internet should have the ambition to protect democratic institutions and give political agency to all – but it should also itself be an expression of democratic values. That means designing for more meaningful bottom-up engagement of citizens, addressing asymmetric power relationships in the digital economy and creating spaces for different voices to be heard.
Both national and local governments worldwide are starting to appreciate the opportunities that the internet and collective intelligence offer in terms of helping them to better understand the views of their citizens. Parliaments from Brazil to Taiwan are inviting citizens to contribute to the legislative process, while cities like Brussels and Paris are asking their residents to help prioritise spending through participatory budgeting. The EU is also preparing a Conference on the Future of Europe to engage citizens at scale in thinking about the future of the bloc, an effort that could be enhanced and facilitated through CI-based approaches like participatory futures. These types of activities can help engage a greater variety of individuals in political decision-making and redefine the relationships between politicians and the constituents they serve.
Unfortunately, some citizen engagement initiatives are still driven by tech-solutionism rather than by the careful design of participation processes that make the most of the collective contributions of citizens. Even when digital democracy projects start out with the best intentions, politicians can struggle to make sense of this new source of insight, which risks valuable ideas being overlooked and trust in democratic processes being diminished.
There are signs that this is changing. For example, the collective intelligence platform Citizen Lab is trying to optimise the channels of communication and interpretation between citizens and politicians. It has started to apply natural language processing algorithms to help organise and identify themes in the ideas that citizens contribute using its platform, helping public servants to make better use of them. Citizen Lab is used by city administrations in more than 20 countries across Europe and offers a glimpse of how Europe can set an example of democratic collective intelligence enabled by the infrastructure of the internet.
A closely related challenge for the internet today is the continued erosion of trust – trust in the veracity of information, trust between citizens online, and trust in public institutions. The internet of the future will have to find ways of dealing with challenges like digital identities and the safety of our everyday online interactions. But perhaps most importantly, the internet must be able to tackle the problems of information overload and misinformation through systems that optimise for fact-based and balanced exchanges, rather than outrage and division.
We have seen some of the dangers of fake news manifest as part of the response to COVID-19. At a time when receiving accurate public health messaging and government communications is a matter of life and death, the cacophony of information on the internet can make it hard for individuals to distinguish the signal from the noise.
Undoubtedly, part of the solution to effectively navigating this new infosphere will require new forms of public-private partnership. By working with media and technology giants like Facebook and Twitter, governments and health agencies worldwide have started to curb some of the negative effects of misinformation in the wake of the coronavirus pandemic. But the commitment to a trustworthy internet is a long-term investment. It will not only rely on the actions of policymakers and industry to develop recognisable trustmarks, but also on a more literate citizenry that is better able to spot suspicious materials and flag concerns.
Many existing fact-checking projects already use crowdsourcing at different stages of the verification process. For example, the company Factmata is developing a technology that will draw on specialist communities of more than 2000 trained experts to help them assess the trustworthiness of online content. However, crowdsourced solutions can be vulnerable to issues of bias, polarisation and gaming, and will need to be consolidated by complementary sources of intelligence such as expert validation or entirely new AI tools that can help to mitigate the effects of social bias.
Undoubtedly, some of our biggest challenges are yet to come. But the internet holds untapped potential for us to build awareness for the interdependency of our social and natural environments. We need to champion models that put the digital economy at the service of creating a more sustainable planet and combating climate change, while also remaining conscious of the environmental footprint these systems have in their own right.
Citizen science is a distinct family of collective intelligence methods where volunteers collect data, make observations or perform analyses that help to advance scientific knowledge. Citizen science projects have proliferated over the last 20 years, in large part due to the internet. For example, the most popular online citizen science platform, Zooniverse, hosts over 50 different scientific projects and has attracted over 1 million contributors.
A large proportion of citizen science projects focus on the environment and ecology, helping to engage members of the public outside of traditional academia with issues such as biodiversity, air quality and pollution of waterways. iNaturalist is an online social network that brings together nature lovers to keep track of different species of plants and animals worldwide. The platform supports learning within a passionate community and creates a unique open data source that can be used by scientists and conservation agencies.
Building the Next Generation Internet – with and for collective intelligence
To enable next-generation collective intelligence, Europe needs to look beyond ‘just AI’ and invest in increasingly smarter ways of connecting people, information and skills, and facilitating interactions on digital platforms. The continued proliferation of data infrastructures, public and private sector data sharing and the emergence of the Internet of Things will play an equally important part in enhancing and scaling up collective human intelligence. Yet, for this technological progress to have a transformative and positive impact on society, it will have to be put in the service of furthering fundamental values. Collective intelligence has the opportunity to be both a key driver and beneficiary of a more inclusive, resilient, democratic, sustainable and trustworthy internet.
At this moment of global deceleration, we suggest it is time to take stock of old trajectories for the internet to set out on a new course, one that allows us to make the most of the diverse collective intelligence that we have within society to become better at solving complex problems. The decisions we make today will help us to shape the society of the future.
Aleks is a Senior Researcher and Project Manager for Nesta’s Centre for Collective Intelligence Design (CCID). The CCID conducts research and develops resources to help innovators understand how they can harness collective intelligence to solve problems. Our latest report, The Future of Minds & Machines mapped the various ways that AI is helping to enhance and scale the problem solving abilities of groups. It is available for download on the Nesta website, where you can also explore 20 case studies of AI & CI in practice.
Mapping the tech world with text-mining: Part 1.
by Kristóf Gyódi, Łukasz Nawaro, Michał Paliński, Maciej Wilamowski
As part of the NGI Forward project, DELab UW is supporting the European Commission’s Next Generation Internet initiative with identifying emerging technologies and social issues related to the Internet. Our team has been experimenting with various natural language processing methods to discover trends and hidden patterns in different types of online media. You may find our tools and presentations at https://fwd.delabapps.eu.
This series of blog posts presents the results of our latest analysis of technology news. We have two main goals:
to discover the most important topics in news discussing emerging technologies and social issues,
to map the relationship between these topics.
Our text mining exercises are based on a technology news data set that consists of 213,000 tech media articles. The data was collected over a period of 40 months (between 2016-01-01 and 2019-04-30) and includes the plain text of the articles. The publishers are based in the US, the UK, Belgium and Australia. More information on the data set is available at a Zenodo repository.
In this first installment, we focus on a widely used text-mining method: Latent Dirichlet Allocation (LDA). LDA gained its popularity due to its ease of use, flexibility and interpretable results. First, we briefly explain the basics of the algorithm for all non-technical readers. In the second part of the post, we show what LDA can achieve with a sufficiently large data set.
Text data is high-dimensional. In its most basic form – but one comprehensible to computers – it is often represented as a bag-of-words (BOW) matrix, where each row is a document and each column contains a count of how often a word occurs in the documents. These matrices can be transformed by linear algebra methods to discover the hidden (latent, lower-dimensional) structure in them.
Topic modeling assumes that documents, such as news articles, contain various distinguishable topics. As an example, a news article covering the Cambridge Analytica scandal may contain the following topics: social media, politics and tech regulations, with the following relations: 60% social media, 30% politics and 10% tech regulations. The other assumption is that topics contain characteristic vocabularies, e.g. the social media topic is described by the words Facebook, Twitter etc.
LDA was proposed by Blei et al. (2003) and is based on Bayesian statistics. The method's name reflects its key foundations. Latent comes from the assumption that documents contain latent topics that we do not know a priori. Allocation indicates that we allocate words to topics, and topics to documents. Dirichlet refers to the Dirichlet distribution, a probability distribution over proportions: it describes the joint distribution of any number of outcomes that sum to one. As an example, a Dirichlet distribution can describe the proportions of observed species on a safari (Downey, 2013). In LDA, it describes the distribution of topics in documents, and the distribution of words in topics.
The basic mechanism behind topic modeling methods is simple: assuming that documents can be described by a limited number of topics, we try to recreate our texts from a combination of topics that consist of characteristic words. More precisely, we aim to recreate our BOW word-document matrix as the combination of two matrices: the matrix containing the Dirichlet distribution of topics in documents (topic-document matrix), and the matrix containing the words in topics (word-topic matrix). The construction of the final matrices is achieved by a process called Gibbs sampling. The idea behind Gibbs sampling is to introduce changes into the two matrices word by word: change the topic allocation of a selected word in a document, then evaluate whether this change improves the decomposition of the document. Repeating these steps across all documents yields the final matrices that best describe the sample.
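This decomposition can be sketched with scikit-learn on a toy corpus (invented for illustration; note that scikit-learn's implementation fits LDA with online variational Bayes rather than Gibbs sampling, but it produces the same two matrices described above):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical articles: two about privacy/social media, two about space.
docs = [
    "facebook twitter privacy data users",
    "privacy data election content moderation",
    "nasa spacex mars moon rocket launch",
    "rocket launch mars mission nasa",
]

bow = CountVectorizer().fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(bow)

doc_topic = lda.transform(bow)  # topic-document matrix: each row sums to 1
word_topic = lda.components_    # word-topic matrix: pseudo-counts per topic
print(doc_topic.shape, word_topic.shape)
```

The product of the two matrices approximates the original BOW matrix, which is exactly the decomposition the algorithm optimises.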
For more details on topic modelling, we recommend this and this excellent post. For the full technical description of this study, head to our full report.
The most important parameter of topic modelling is the number of topics. The main objective is to obtain a satisfactory level of topic separation, i.e. a situation in which topics are neither lumped together into catch-all categories nor overly fragmented. In order to achieve that, we experimented with different LDA hyperparameter settings. With 20 topics, the topics were balanced and separable.
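One common quantitative aid for this search is held-out perplexity, where lower values suggest a better fit (a sketch on an invented toy corpus; in practice, as here, the score is combined with human judgment of topic separation):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# A repeated toy corpus so the model has a few documents per theme.
docs = [
    "privacy data facebook users social",
    "election content moderation ban platform",
    "rocket mars nasa launch space",
    "cpu chip intel amd hardware",
] * 5

bow = CountVectorizer().fit_transform(docs)

# Compare candidate topic counts by perplexity on the training corpus.
for k in (2, 4, 8):
    lda = LatentDirichletAllocation(n_components=k, random_state=0).fit(bow)
    print(k, round(lda.perplexity(bow), 1))
```

On a real corpus the held-out (not training) perplexity would be used, and candidate models would still be inspected manually for interpretability.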
Therefore, we identified 20 major topics, presented in the visualisation below. Each circle represents a topic (the size reflects the topic's prevalence in the documents), with distances determined by the similarity of vocabularies: topics sharing the same words are closer to each other. In the right panel, the bars represent the individual terms that are characteristic of the currently selected topic on the left. A pair of overlapping bars represents both the corpus-wide frequency of a given term and its topic-specific frequency. We managed to reach gradually decreasing topic sizes: the largest topic has a share of 19%, the 5th 8%, and the 10th 5%.
After studying these most relevant terms, we labeled each topic with the closest umbrella term. Upon closer examination, we have reduced the number of topics to 18 (topics 5 & 16 became the joined category Space tech, while topics 10 & 19 were melded together to form a topic on Online streaming). In the following sections we provide brief descriptions of the identified topics.
AI & robots
AI & robots constitutes the largest topic containing around 19% of all tokens and is characterized by machine learning jargon (e.g. train data) as well as popular ML applications (robots, autonomous cars).
Social media crisis
The social media topic is similarly prevalent and covers contentious aspects of modern social media platforms (facebook, twitter), such as the right to privacy, content moderation, user bans and election meddling through microtargeting (i.a.: privacy, ban, election, content, remove).
A large share of tech articles cover business news, especially on major platforms (uber, amazon), services such as cloud computing (aws), and emerging technologies such as IoT and blockchain. The topic words also suggest a strong focus on the financial results of tech companies (revenue, billion, sale, growth).
Topic 4 covers articles about the $522B smartphone market. Two major manufacturers, Samsung and Apple, are at the top of the keyword list with an equal number of appearances. Articles are focused on the features, parameters and additional services provided by devices (camera, display, alexa etc.).
Space exploration excitement is common in the tech press. Topic 5 contains reports about NASA, future Mars and Moon mission as well as companies working on space technologies, such as SpaceX.
Topic 6 revolves around the Cambridge Analytica privacy scandal and gathers all mentions of this keyword in the corpus. The involvement of Cambridge Analytica in the Leave campaign during the Brexit referendum is a major focus, as suggested by the high position of keywords such as eu and uk. Unsurprisingly, GDPR is also often mentioned in the articles dealing with the aftermath of the CA controversy.
Topic 7 pertains to cyberspace security issues. It explores subjects of malware and system vulnerabilities targeting both traditional computer systems, as well as novel decentralized technologies based on blockchain.
The much anticipated fifth-generation wireless network has huge potential to transform all areas with an ICT component. Topic 8 deals with global competition over delivering 5G tech to the market (huawei, ericsson). It also captures the debate about 5G's impact on net neutrality: a key capability of 5G is signal 'segmentation', which has caused debate over whether it can be treated like previous generations of mobile communications under net neutrality laws.
The focus of Topic 9 is on operating systems, both mobile (ios, android) and desktop (windows, macos), as well as dedicated services (browsers: chrome, mozilla) and app stores (appstore).
Topic 10 revolves around the most important media platforms: streaming and social media. The global video streaming market was valued at around USD 37B in 2018; music streaming adds another 9B to this number and accounts for nearly half of the music industry's revenue. In particular, this topic focuses on major streaming platforms (youtube, netflix, spotify), social media (facebook, instagram, snapchat), the rising popularity of podcasts, and the business strategies of streaming services (subscriptions, ads).
During its 40-year history, Microsoft has made over 200 acquisitions. Some of them were considered successful (e.g. LinkedIn, Skype), while others were less so… (Nokia). Topic 11 collects articles describing Microsoft's finished, planned and failed acquisitions in recent years (github, skype, dropbox, slack).
Autonomous transportation is a vital point of public debate. Policy makers should consider whether to apply subsidies or taxes to equalize the public and private costs and benefits of this technology. AV technology offers the possibility of significant benefits to social welfare — saving lives; reducing crashes, congestion, fuel consumption, and pollution; increasing mobility for the disabled; and ultimately improving land use (RAND, 2016). Topic 12 addresses the technology's shortcomings (batteries) as well as positive externalities such as lower emissions (epa, emissions).
LDA modelling has identified Tesla and other Elon Musk projects as a separate topic. Besides Tesla developments of electric and autonomous vehicles, the topic also includes words related to other mobility solutions (e.g. Lime).
CPU and other hardware
Topics 14 & 15 are focused on hardware. Topic 14 covers CPU innovation race between Intel and AMD, as well as the Broadcom-Qualcomm acquisition saga, blocked by Donald Trump due to national security concerns. Topic 15 includes news regarding various standards (usb-c), storage devices (ssd) etc.
Topic 17 concentrates on startup ecosystems and crowdsourced financing. Articles discuss major startup competitions such as Startup Battlefield or Startup Alley, and crowdfunding services such as Patreon.
We observe a surge in the adoption of wearables, such as fitness trackers, smart watches and augmented or virtual reality headsets. This trend brings important policy questions. On the one hand, wearables offer tremendous potential when it comes to monitoring health. On the other hand, it might be overshadowed by concerns about user privacy and access to personal data. Articles in topic 18 discuss news from the wearable devices world regarding new devices, novel features etc. (fitbit, heartrate).
Topic 20 deals with the gaming industry. It covers, inter alia, popular games (pokemon), gaming platforms (nintendo), various game consoles (switch) and game expos (e3).
We provided a bird's eye view of the technology world with topic modelling. Topic modelling serves as an appropriate basis for exploring broad topics, such as the social media crisis, AI or business news. At this stage, we were able to identify the major umbrella topics that ignite the public debate.
In the next post, we will introduce another machine learning method: t-SNE. With the help of this algorithm, we will create a two-dimensional map of the news, where articles covering the same topic will be neighbours. We will also show how t-SNE can be combined with LDA.
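As a preview of that combination (a sketch with invented data, not the actual news corpus), t-SNE can be applied directly to an LDA-style topic-document matrix so that articles with similar topic mixtures land near each other in 2D:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Stand-in for an LDA topic-document matrix: 100 articles over 20 topics,
# each row a Dirichlet-distributed topic mixture.
doc_topic = rng.dirichlet(alpha=[0.5] * 20, size=100)

# t-SNE projects each article to two dimensions for plotting as a map.
coords = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(doc_topic)
print(coords.shape)
```

Each row of `coords` is then a point on the news map, ready to be coloured by its article's dominant topic.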
When mind meets machine: harnessing collective intelligence for Europe
by Harry Armstrong, Peter Baeck
Collective intelligence (CI) has emerged in the last few years as a new field, prompted by a wave of digital technologies that make it possible for organisations and societies to harness the intelligence of many people, and things, on a huge scale.
It is a rapidly evolving area, encompassing everything from citizen science to open innovation to the potential use of data trusts, and offers enormous new opportunities in fields like sustainability, health and democracy. For Europe, harnessing CI will be critical to achieving its economic and social goals through initiatives like the Green New Deal or Next Generation Internet.
A quick primer
Collective intelligence is created when people work together, often with the help of technology, to mobilise a wider range of information, ideas and insights to address a social challenge.
As an idea, it isn’t new. It’s based on the theory that groups of diverse people are collectively smarter than any single individual on their own. The premise is that intelligence is distributed. Different people hold different pieces of information and different perspectives that, when combined, create a more complete picture of a problem and how to solve it. The intelligence of the crowd can be further augmented by combining these insights with data analytics and Artificial Intelligence (AI). Bringing these two elements together can be extremely powerful, but the field is still emerging and it isn’t always clear how to do it well.
Nesta’s Centre for Collective Intelligence Design is at the forefront of both research and practice in this space, and has recently developed a Collective Intelligence Playbook to support others to harness CI more effectively.
To explore the opportunities for Europe and the European Commission’s NGI initiative further, Nesta held a workshop as part of the MyData event in Helsinki in September. In this workshop, we introduced participants to the concept of collective intelligence and asked: how can we best combine the intelligence of the crowd and artificial intelligence to solve some of today’s largest societal problems?
During the workshop Peter Baeck, Head of the Centre for Collective Intelligence Design at Nesta and Aleks Berditchevskaia, Senior Researcher on collective intelligence explained the concept of CI and then took workshop participants through the Collective Intelligence Toolkit developed by Nesta.
Katja Henttonen, project manager in e-democracy for Helsinki, provided a live case study introduction to CI in practice through a demonstration of the Decidim online democracy and participatory budgeting tools currently being trialled in the city.
Using the collective intelligence toolkit canvas and method prompt cards, the groups were given an hour to work on a number of practical problem statements exploring several challenges in the internet space. The groups explored opportunities around collective intelligence, as well as questions about how CI can be practically used at scale, learning from exciting case studies from around the world.
What did we learn from the workshop?
Workshop participants saw huge potential in using CI and the toolkit. In particular, participants were excited to be introduced to new methods such as citizen science or using satellite data for collective intelligence.
There is a clear need for better practical guidance, such as Nesta’s playbook, on what CI is and how it can be applied by organisations. Workshop participants suggested this could be done through further development of practical tools and guides for how to design for CI and the creation of open repositories or data bases on CI methods and use cases. Within this, participants highlighted the need to make the support for CI as practical as possible and suggested connecting any research, investment and support for CI to specific social challenges, such as climate change, fake news or digital democracy.
Of the different tools and methods to enable CI, the workshop highlighted a particular interest in understanding the relationship between human and machine intelligence in enabling different forms of CI and raised three challenges/questions:
A better understanding is needed of the different functions in the relationship between human and machine intelligence, and of how to design solutions that tap into their benefits while maintaining strong ethical frameworks and giving individuals control over what data they contribute to the collective and how it can be used.
Knowledge is needed on how AI-enabled CI can be applied within grassroots networks and NGOs to better mobilise volunteers, activists and community groups to identify and solve common challenges.
Funding that explicitly focuses on bringing together the AI community with the CI community is needed to foster new forms of collaboration.