Creating (positive) friction in AI procurement

I had the opportunity to participate in the Inaugural AI Commercial Lifecycle and Procurement Summit 2024 hosted by Curshaw. This was a very interesting ‘unconference’ where participants offered to lead sessions on topics they wanted to talk about. I led a session on ‘Creating friction in AI procurement’.

This was clearly a counterintuitive way of thinking about AI and procurement, given that the ‘big promise’ of AI is that it will reduce friction (eg through automation, and/or delegation of ‘non-value-added’ tasks). Why would I want to create friction in this context?

The first clarification I was thus asked for was whether this was about ‘good friction’ (as opposed to the old, bad ‘red tape’ kind of friction), which of course it was (?!); the second, what I meant by friction.

My recent research on AI procurement (eg here and here for the book-long treatment) has led me to conclude that we need to slow down the process of public sector AI adoption and to create mechanisms that bring back to the table the ‘non-AI’ option and several ‘stop project’ or ‘deal breaker’ trumps to push back against the tidal wave of unavoidability that seems to dominate all discussions on public sector digitalisation. My preferred solution is to do so through a system of permissioning or licensing administered by an independent authority—but I am aware and willing to concede that there is no political will for it. I thus started thinking about second-best approaches to slowing down public sector AI procurement. This is how I got to the idea of friction.

By creating friction, I mean the need for a structured decision-making process that allows for collective deliberation within and around the adopting institution, and which is supported by rigorous impact assessments that tease out the second and third order implications of AI adoption, as well as thoroughly interrogate first order issues around data quality and governance, technological governance and organisational capability, in particular around risk management and mitigation. This is complementary to—but hopefully goes beyond—emerging frameworks to determine organisational ‘risk appetite’ for AI procurement, such as that developed by the AI Procurement Lab and the Centre for Inclusive Change.

The conversations on ‘good friction’ moved in different directions, but there are some takeaways and ideas that stuck with me (or that I managed to jot down in my notes while chatting to others), such as (in no particular order of importance or potential):

  • the potential for ‘AI minimisation’ or ‘non-AI equivalence’ to test the need for (specific) AI solutions—if you can sufficiently approximate, or replicate, the same functional outcome without AI, or with a simpler type of AI, why not do it that way?;

  • the need for a structured catalogue of solutions (and components of solutions) that are already available (sometimes in open access, where there is lots of duplication) to inform such considerations;

  • the importance of asking whether procuring AI is driven by considerations such as availability of funding (is this funded if done with AI but not funded, or hard to fund at the same level, if done in other ways?), which can clearly skew decision-making—the importance of considering the effects of ‘digital industrial policy’ on decision-making;

  • the power (and relevance) of the deceptively simple question ‘is there an interdisciplinary team to be dedicated to this, and exclusively to this’?;

  • the importance of knowledge and understanding of the tech and its implications from the beginning, and of expertise in the translation of technical and governance requirements into procurement requirements, to avoid ‘games of chance’ whereby the use of ‘trendy terms’ (such as ‘agile’ or ‘responsible’) may or may not lead to the award of the contract to the best-placed and best-fitting (tech) provider;

  • the possibility to adapt civic monitoring or social witnessing mechanisms used in other contexts, such as large infrastructure projects, to be embedded in contract performance and auditing phases;

  • the importance of understanding displacement effects and whether deploying a solution (AI or automation, or similar) to deal with a bottleneck will simply displace the issue to another (new) bottleneck somewhere along the process;

  • the importance of understanding the broader organisational changes required to capture the hoped-for (productivity) gains arising from the tech deployment;

  • the importance of carefully considering and resourcing the much-needed engagement of the ‘intelligent person’ who needs to check the design and outputs of the AI, including frontline workers and those at the receiving end of the relevant decisions or processes and the affected communities—the importance of creating meaningful and effective deliberative engagement mechanisms;

  • relatedly, the need to ensure organisational engagement and alignment at every level and every step of the AI (pre)procurement process (on which I would recommend reading this recent piece by Kawakami and colleagues);

  • the need to assess the impacts of changes in scale, complexity, and error exposure;

  • the need to create adequate circuit-breakers throughout the process.

Certainly lots to reflect on and try to embed in future research and outreach efforts. Thanks to all those who participated in the conversation, and to those interested in joining it. A structured way to do so is through this LinkedIn group.

Meaning, AI, and procurement -- some thoughts


James McKinney and Volodymyr Tarnay of the Open Contracting Partnership have published ‘A gentle introduction to applying AI in procurement’. It is a very accessible and helpful primer on some of the most salient issues to be considered when exploring the possibility of using AI to extract insights from procurement big data.

The OCP introduction to AI in procurement provides helpful pointers in relation to task identification, method, input, and model selection. I would add that an initial exploration of the possibility to deploy AI also (and perhaps first and foremost) requires careful consideration of the level of precision and the type (and size) of errors that can be tolerated in the specific task, and of ways to test and measure them.
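To make this more concrete, below is a minimal and entirely hypothetical sketch of what ‘testing and measuring’ error tolerance could look like in practice: comparing an AI tool’s flags against a small hand-labelled sample of notices and reporting precision, recall and the types of error made. The task (spotting green criteria in tender notices), the labels and the flags are all invented for illustration only.

```python
# Hypothetical illustration of measuring the error profile of an AI tool
# before relying on it: compare the tool's flags against a small
# hand-labelled sample and report precision, recall and the error types.
# The labels and flags below are invented for the sake of the example.

hand_labels = [1, 1, 0, 0, 1, 0, 0, 1, 0, 0]  # 1 = notice really contains green criteria
ai_flags    = [1, 0, 0, 1, 1, 0, 0, 1, 1, 0]  # 1 = the AI tool flagged the notice

true_pos  = sum(1 for y, p in zip(hand_labels, ai_flags) if y == 1 and p == 1)
false_pos = sum(1 for y, p in zip(hand_labels, ai_flags) if y == 0 and p == 1)
false_neg = sum(1 for y, p in zip(hand_labels, ai_flags) if y == 1 and p == 0)

precision = true_pos / (true_pos + false_pos)  # how many of the flags were right
recall    = true_pos / (true_pos + false_neg)  # how many real cases were caught

print(f"precision: {precision:.2f}  (false positives: {false_pos})")
print(f"recall:    {recall:.2f}  (false negatives: {false_neg})")
```

Whether a precision of 0.60 or a recall of 0.75 is tolerable depends entirely on the task and on which type of error is costlier, which is precisely the kind of judgement that needs to be made before deciding whether AI is a suitable tool at all.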

One of the crucial and perhaps more difficult to understand issues covered by the introduction is how AI seeks to capture ‘meaning’ in order to extract insights from big data. This is also a controversial issue that keeps coming up in procurement data analysis contexts, and one that triggered some heated debate at the Public Procurement Data Superpowers Conference last week—where, in my view, companies selling procurement insight services were peddling hyped claims (see session on ‘Transparency in public procurement - Data readability’).

In this post, I venture some thoughts on meaning, AI, and public procurement big data. As always, I am very interested in feedback and opportunities for further discussion.

Meaning

Of course, the concept of meaning is complex and open to philosophical, linguistic, and other interpretations. Here I take a relatively pedestrian and pragmatic approach and, following the Cambridge dictionary, consider two ways in which ‘meaning’ is understood in plain English: ‘the meaning of something is what it expresses or represents’, and meaning as ‘importance or value’.

To put it simply, I will argue that AI cannot capture meaning proper. It can carry out complex analysis of ‘content in context’, but we should not equate that with meaning. This will be important later on.

AI, meaning, embeddings, and ‘content in context’

The OCP introduction helpfully addresses this issue in relation to an example of ‘sentence similarity’, where the researchers are looking for phrases that are alike in tender notices and predefined green criteria, and therefore want to use AI to compare sentences and assign them a similarity score. Intuitively, ‘meaning’ would be important to the comparison.

The OCP introduction explains that:

Computers don’t understand human language. They need to operate on numbers. We can represent text and other information as numerical values with vector embeddings. A vector is a list of numbers that, in the context of AI, helps us express the meaning of information and its relationship to other information.

Text can be converted into vectors using a model. [A sentence transformer model] converts a sentence into a vector of 384 numbers. For example, the sentence “don’t panic and always carry a towel” becomes the numbers 0.425…, 0.385…, 0.072…, and so on.

These numbers represent the meaning of the sentence.

Let’s compare this sentence to another: “keep calm and never forget your towel” which has the vector (0.434…, 0.264…, 0.123…, …).

One way to determine their similarity score is to use cosine similarity to calculate the distance between the vectors of the two sentences. Put simply, the closer the vectors are, the more alike the sentences are. The result of this calculation will always be a number from -1 (the sentences have opposite meanings) to 1 (same meaning). You could also calculate this using other trigonometric measures such as Euclidean distance.

For our two sentences above, performing this mathematical operation returns a similarity score of 0.869.

Now let’s consider the sentence “do you like cheese?” which has the vector (-0.167…, -0.557…, 0.066…, …). It returns a similarity score of 0.199. Hooray! The computer is correct!

But, this method is not fool-proof. Let’s try another: “do panic and never bring a towel” (0.589…, 0.255…, 0.0884…, …). The similarity score is 0.857. The score is high, because the words are similar… but the logic is opposite!
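For readers who want to see what the comparison described in the quoted passage looks like in code, here is a minimal sketch using the open-source sentence-transformers library and the all-MiniLM-L6-v2 model. That model is an assumption on my part, chosen because it produces 384-dimensional embeddings, matching the description above; the OCP introduction may rely on a different model, so the exact scores will not be identical.

```python
# Minimal sketch of sentence similarity via vector embeddings, as described
# in the quoted OCP passage. Assumes the `sentence-transformers` library and
# the all-MiniLM-L6-v2 model (384-dimensional vectors); exact scores depend
# on the model actually used.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

reference = "don't panic and always carry a towel"
candidates = [
    "keep calm and never forget your towel",  # similar wording, similar logic
    "do you like cheese?",                    # unrelated
    "do panic and never bring a towel",       # similar wording, opposite logic
]

# Each sentence becomes a vector (a list of 384 numbers).
ref_vec = model.encode(reference)
cand_vecs = model.encode(candidates)

# Cosine similarity = dot(a, b) / (|a| * |b|); the closer to 1, the more
# closely the two vectors point in the same direction.
for sentence, vec in zip(candidates, cand_vecs):
    score = util.cos_sim(ref_vec, vec).item()
    print(f"{score:.3f}  {sentence}")
```

As the quoted passage anticipates, the sentence with the opposite logic will still score high, because what is being compared is the arrangement of similar words in similar contexts, not meaning.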

I think there are two important observations in relation to the use of meaning here (highlighted above).

First, meaning can hardly be captured where sentences with opposite logic are considered very similar. This is because the method described above (vector embedding) does not capture meaning. It captures content (words) in context (around other words).

Second, it is not possible to fully express in numbers what text expresses or represents, or its importance or value. What the vectors capture is the representation or expression of such meaning, the representation of its value and importance through the use of those specific words in the particular order in which they are expressed. The string of numbers is thus a second-degree representation of the meaning intended by the words; it is a numerical representation of the word representation, not a numerical representation of the meaning.

Unavoidably, there is plenty of scope for loss, alteration or even inversion of meaning when it goes through multiple imperfect processes of representation. This means that the more open-textured the expression in words, and the less contextualised its presentation, the more difficult it is to achieve good results.

It is important to bear in mind that the current techniques based on this or similar methods, such as those based on large language models, clearly fail on crucial aspects such as their factuality—which ultimately requires checking whether something with a given meaning is true or false.

This is a burgeoning area of technical research, but it seems that even the most accurate models tend to hover around 70% accuracy, save in highly contextual, non-ambiguous settings (see eg D Quelle and A Bovet, ‘The perils and promises of fact-checking with large language models’ (2024) 7 Front. Artif. Intell., Sec. Natural Language Processing). While this is an impressive feature of these tools, it can hardly be acceptable to extrapolate that they can be deployed for tasks that require precision and factuality.

Procurement big data and ‘content and context’

In some senses, the application of AI to extract insights from procurement big data benefits from the fact that, by and large, existing procurement data is very precisely contextualised and increasingly concerns structured content—that is, most of the procurement data that is (increasingly) available is captured in structured notices and tends to have a narrowly defined and highly contextual purpose.

From that perspective, there is potential to look for implementations of advanced comparisons of ‘content in context’. But this will most likely have a hard boundary where ‘meaning’ needs to be interpreted or analysed, as AI cannot perform that task. At most, it can help gather the information, but it cannot analyse it because it cannot ‘understand’ it.

Policy implications

In my view, the above shows that the possibility of using AI to extract insights from procurement big data needs to be approached with caution. For tasks where a ‘broad brush’ approach will do, these can be helpful tools. They can help mitigate the informational deficit that procurement policy and practice tend to encounter. As put at the conference last week, these tools can help get a sense of broad trends or directions, and can thus inform policy and decision-making only in that regard and to that extent. Conversely, AI cannot be used in contexts where precision is important and where errors would affect important rights or interests.

This is important, for example, in relation to the fascination that AI ‘business insights’ seems to be triggering amongst public buyers. One of the issues that kept coming up concerns why contracting authorities cannot benefit from the same advances that are touted as being offered to (private) tenderers. The case at hand was that of identifying ‘business opportunities’.

A number of companies are using AI to support searches for contract notices to highlight potentially interesting tenders to their clients. They offer services such as ‘tender summaries’, whereby the AI creates a one-line summary on the basis of a contract notice or a tender description, and this summary can be automatically translated (eg into English). They also offer search services based on ‘capturing meaning’ from a company’s website and matching it to potentially interesting tender opportunities.

All these services, however, are at bottom a sophisticated comparison of content in context, not of meaning. And these are deployed to go from more to less information (summaries), which can reduce problems with factuality and precision except in extreme cases, and in a setting where getting it wrong has only a marginal cost (ie the company will set aside the non-interesting tender and move on). This is also an area where expectations can be managed and where results well below 100% accuracy can be interesting and have value.

The opposite does not apply from the perspective of the public buyer. For example, a summary of a tender is unlikely to have much value as, in all likelihood, the summary will simply confirm that the tender matches the advertised object of the contract (which has no value, unlike a summary suggesting a tender matches the business activities of an economic operator). Moreover, factuality is extremely important and only 100% accuracy will do in a context where decision-making is subject to good administration guarantees.

Therefore, we need to be very careful about how we think about using AI to extract insights from procurement (big) data and, as the OCP introduction highlights, one of the most important things is to clearly define the task for which AI would be used. In my view, the range of suitable tasks is much more limited than one could dream up if we let our collective imagination run high on hype.

Did you use AI to write this tender? What? Just asking! -- Also, how will you use AI to deliver this contract?

The UK’s Cabinet Office has published procurement policy note 2/24 on ‘Improving Transparency of AI use in Procurement’ (the ‘AI PPN’) because ‘AI systems, tools and products are part of a rapidly growing and evolving market, and as such, there may be increased risks associated with their adoption … [and therefore] it is essential to take steps to identify and manage associated risks and opportunities, as part of the Government’s commercial activities’.

The crucial risk the AI PPN seems to be concerned with relates to generative AI ‘hallucinations’, as it includes background information highlighting that:

‘Content created with the support of Large Language Models (LLMs) may include inaccurate or misleading statements; where statements, facts or references appear plausible, but are in fact false. LLMs are trained to predict a “statistically plausible” string of text, however statistical plausibility does not necessarily mean that the statements are factually accurate. As LLMs do not have a contextual understanding of the question they are being asked, or the answer they are proposing, they are unable to identify or correct any errors they make in their response. Care must be taken both in the use of LLMs, and in assessing returns that have used LLMs, in the form of additional due diligence.’

The PPN has the main advantage of trying to tackle the challenge of generative AI in procurement head on. It can help raise awareness in case someone was not yet talking about this and, more seriously, it includes an Annex A that brings together the several different bits of guidance issued by the UK government to date. However, the AI PPN does not elaborate on any of that guidance and is thus as limited as the Guidelines for AI procurement (see here), relatively complicated in that it points to rather different types of guidance ranging from ethics, to legal, to practical considerations, and requires significant knowledge and expertise to be operationalised (see here). Perhaps the best evidence of the complexity of the mushrooming sets of guidance is that the PPN itself includes in Annex A a reference to the January 2024 Guidance to civil servants on use of generative AI, which has been superseded by the Generative AI Framework for HMG, to which it also refers in Annex A. In other words, the AI PPN is not a ‘plug-and-play’ document setting out how to go about dealing with AI hallucinations and other risks in procurement. And given the pace of change in this area, it is also bound to be a PPN that requires multiple revisions and adaptations going forward.

A screenshot showing that the January guidance on generative AI use has been superseded (taken on 26 March 2024 10:20am).

More generally, the AI PPN is bound to be controversial and has already spurred insightful discussion on LinkedIn. I would recommend the posts by Kieran McGaughey and Ian Makgill. I offer some additional thoughts here and look forward to continuing the conversation.

In my view, one of the potential issues arising from the AI PPN is that it aims to cover quite a few different aspects of AI in procurement, while neglecting others. Slightly simplifying, there are three broad areas of AI-procurement interaction. First, there is the issue of buying AI-based solutions or services. Second, there is the issue of tenderers using (generative) AI to write or design their tenders. Third, there is the issue of the use of AI by contracting authorities, eg in relation to qualitative selection/exclusion, or evaluation/award decisions. The AI PPN touches on aspects of all three, albeit to very different degrees. However, it is not clear to me that these can be treated together, as they pose significantly different policy issues. I will try to disentangle them here.

Buying and using AI

Although it mainly cross-refers to the Guidelines for AI procurement, the AI PPN includes some content relevant to the procurement and use of AI when it stresses that ‘Commercial teams should take note of existing guidance when purchasing AI services, however they should also be aware that AI and Machine Learning is becoming increasingly prevalent in the delivery of “non-AI” services. Where AI is likely to be used in the delivery of a service, commercial teams may wish to require suppliers to declare this, and provide further details. This will enable commercial teams to consider any additional due diligence or contractual amendments to manage the impact of AI as part of the service delivery.’ This is an adequate and potentially helpful warning. However, as discussed below, the PPN suggests a way to go about it that is in my view wrong and potentially very problematic.

AI-generated tenders

The AI PPN is however mostly concerned with the use of AI for tender generation. It recognises that there ‘are potential benefits to suppliers using AI to develop their bids, enabling them to bid for a greater number of public contracts. It is important to note that suppliers’ use of AI is not prohibited during the commercial process but steps should be taken to understand the risks associated with the use of AI tools in this context, as would be the case if a bid writer has been used by the bidder.’ It indicates some potential steps contracting authorities can take, such as:

  • ‘Asking suppliers to disclose their use of AI in the creation of their tender.’

  • ‘Undertaking appropriate and proportionate due diligence:

    • If suppliers use AI tools to create tender responses, additional due diligence may be required to ensure suppliers have the appropriate capacity and capability to fulfil the requirements of the contract. Such due diligence should be proportionate to any additional specific risk posed by the use of AI, and could include site visits, clarification questions or supplier presentations.

    • Additional due diligence should help to establish the accuracy, robustness and credibility of suppliers’ tenders through the use of clarifications or requesting additional supporting documentation in the same way contracting authorities would approach any uncertainty or ambiguity in tenders.’

  • ‘Potentially allowing more time in the procurement to allow for due diligence and an increase in volumes of responses.’

  • ‘Closer alignment with internal customers and delivery teams to bring greater expertise on the implications and benefits of AI, relative to the subject matter of the contract.’

In my view, there are a few problematic aspects here. While the AI PPN seems to try not to single out the use of generative AI as potentially problematic by equating it to the possible use of (human) bid writers, this is unconvincing. First, because there is (to my knowledge) no guidance whatsoever on an assessment of whether bid writers have been used, and because the AI PPN itself does not require disclosure of the engagement of bid writers (or put any thought into the fact that third-party bid writers may have used AI without this being known to the hiring tenderer, which would then require an extension of the disclosure of AI use further down the tender generation chain). Second, because the approach taken in the AI PPN seems to point at potential problems with the use of (external, third-party) bid writers, whereas it does not seem to object to the use of (in-house) bid writers, potentially by much larger economic operators, which seems presumptively not to generate issues. Third, and most importantly, because it shows that perhaps not enough has been done so far to tackle the potential deceit or provision of misleading information in tenders if contracting authorities must now start thinking about how to get expert-based analysis of tenders, or develop fact-checking mechanisms to ensure bids are truthful. You would have thought that, regardless of the origin of a tender, contracting authorities should already be able to check its content to an adequate level of due diligence.

In any case, the biggest issue with the AI PPN is how it suggests contracting authorities should deal with this issue, as discussed below.

AI-based assessments

The AI PPN also suggests that contracting authorities should be ‘Planning for a general increase in activity as suppliers may use AI to streamline or automate their processes and improve their bid writing capability and capacity leading to an increase in clarification questions and tender responses.’ One of the possibilities could be for contracting authorities to ‘fight fire with fire’ and also deploy generative AI (eg to make summaries, to scan for errors, etc). Interestingly, though, the AI PPN does not directly refer to the potential use of (generative) AI by contracting authorities.

While it includes a reference in Annex A to the Generative AI framework for HM Government, that document does not specifically address the use of generative AI to manage procurement processes (and what it says about buying generative AI is redundant given the other guidance in the Annex). In my view, the generative AI framework pushes strongly against the use of AI in procurement when it identifies a series of use cases to avoid (page 18) that include contexts where high-accuracy and high-explainability are required. If this is the government’s (justified) view, then the AI PPN has been a missed opportunity to say this more clearly and directly.

The broader issue of confidential, classified or proprietary information

Both in relation to the procurement and use of AI, and the use of AI for tender generation, the AI PPN stresses that it may be necessary to consider:

  • ‘Putting in place proportionate controls to ensure bidders do not use confidential contracting authority information, or information not already in the public domain as training data for AI systems e.g. using confidential Government tender documents to train AI or Large Language Models to create future tender responses.’; and that

  • ‘In certain procurements where there are national security concerns in relation to use of AI by suppliers, there may be additional considerations and risk mitigations that are required. In such instances, commercial teams should engage with their Information Assurance and Security colleagues, before launching the procurement, to ensure proportionate risk mitigations are implemented.’

These are issues that can easily exceed the technical capabilities of most contracting authorities. It is very hard to know what data has been used to train a model, and economic operators using ‘off-the-shelf’ generative AI solutions will hardly be in a position to make that assessment themselves, or to provide any meaningful information to contracting authorities. While there can be contractual constraints on the use of information and data generated under a given contract, it is much more challenging to assess whether information and data have been inappropriately used at a different link of increasingly complex digital supply chains. And, in any case, this is not only an issue for future contracts. Data and information generated under contracts already in place may not be subject to adequate data governance frameworks. It would seem that a more muscular approach to auditing data governance issues may be required, and that this should not be devolved to the procurement function.

How to deal with it? — or where the PPN goes wrong

The biggest weakness in the AI PPN is in how it suggests contracting authorities should deal with the issue of generative AI. In my view, it gets it wrong in two different ways. First, by asking for too much non-scored information that contracting authorities are unlikely to be able to act on without breaching procurement and good administration principles. Second, by asking for too little information, presented as non-scored, about matters that contracting authorities are under a duty to assess and score.

Too much information

The AI PPN includes two potential (alternative) disclosure questions in relation to the use of generative AI in tender writing (Q1 and Q2 in the PPN).

I think these questions miss the mark and expose contracting authorities to risks of challenge on grounds of a potential breach of the principle of equal treatment and the duty of good administration. The potential breach of the duty of good administration could be on grounds that the contracting authority is taking irrelevant information into account in the assessment of the relevant tender. The potential breach of equal treatment could come if tenders with some AI-generated elements were subjected to significantly more scrutiny than tenders where no AI was used. Contracting authorities should subject all tenders to the same level of due diligence and scrutiny because, at the bottom of it, there is no reason to ‘take a tenderer at its word’ when no AI is used. That is the entire logic of the exclusion, qualitative selection and evaluation processes.

Crucially, though, what the questions seem to really seek to ascertain is that the tenderer has checked for and confirms the accuracy of the content of the tender and thus makes the content its own and takes responsibility for it. This could be checked generally by asking all tenderers to confirm that the content of their tenders is correct and a true reflection of their capabilities and intended contractual delivery, reminding them that contracting authorities have tools to sanction economic operators that have ‘negligently provided misleading information that may have a material influence on decisions concerning exclusion, selection or award’ (reg.57(8)(i)(ii) PCR2015 and sch.7 13(2)(b) PA2023). And then enforcing them!

Checking the ‘authenticity’ of tenders when in fact contracting authorities are meant to check their truthfulness, accuracy and deliverability would be a false substitution of the relevant duties. It would also potentially skew the incentives to disclose the use of AI generation (lest contracting authorities find a reliable way of identifying it themselves and start applying the exclusion grounds above)—as thoroughly discussed in the LinkedIn posts referred to above.

Too little information

Conversely, the PPN takes too soft and potentially confusing an approach to the use of AI to deliver the contract. The proposed disclosure question (Q3) is very problematic. It presents as ‘for information only’ a request for information on the use of AI or machine learning in the context of the actual delivery of the contract. This is information that will relate to the technical specifications, award criteria or performance clauses (or all of them), and there is no meaningful way in which AI could be used to deliver the contract without this having an impact on the assessment and evaluation of the tender. The question is potentially misleading not only because of the indication that the information would not be scored, but also because it suggests that the use of AI in the delivery of a service or product is within the discretion of the tenderers. In my view, this would only be possible if the technical specifications were rather loosely written in performance terms, which would then require a very thorough description and assessment of how that performance is to be achieved. Moreover, the use of AI would probably require a set of organisational arrangements that should also not go unnoticed or unchecked in the procurement process. Furthermore, one of the main challenges may not lie in the use of AI in new contracts (where tenderers are likely to highlight it to stress the advantages, or to justify that their tenders are not abnormally low in comparison with delivery through ‘manual’ solutions), but in relation to pre-existing contracts. A broader policy, recommendation and audit of the use of generative AI for the delivery of existing contracts, and of its treatment as a (permissible??) contract modification, would also seem to be needed.

Final thought

The AI PPN is an interesting development and will help crystallise many discussions that were somehow hovering in the background. However, a significant rethink is required and, in my view, much more detailed guidance is needed in relation to the different dimensions of the interaction between AI and procurement. There are important questions that remain unaddressed and, in my view, one of the most pressing ones concerns the balance between general regulation and the use of procurement to regulate AI use. While the UK government remains committed to its ‘pro-innovation’ approach and no general regulation of AI use is put in place, in particular in relation to public sector AI use, procurement will continue to struggle and fail to act as a regulator of the technology.

Responsibly Buying Artificial Intelligence: A Regulatory Hallucination?

I look forward to delivering the lecture ‘Responsibly Buying Artificial Intelligence: A Regulatory Hallucination?’ as part of the Current Legal Problems Lecture Series 2023-24 organised by UCL Laws. The lecture will be this Thursday 23 November 2023 at 6pm GMT and you can still register to participate (either online or in person). These are the slides I will be using, in case you want to take a sneak peek. I will post a draft version of the paper after the lecture. Comments welcome!

Some thoughts on the US' Executive Order on the Safe, Secure, and Trustworthy Development and Use of AI

On 30 October 2023, President Biden adopted the Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence (the ‘AI Executive Order’, see also its Factsheet). The use of AI by the US Federal Government is an important focus of the AI Executive Order. It will be subject to a new governance regime detailed in the Draft Policy on the use of AI in the Federal Government (the ‘Draft AI in Government Policy’, see also its Factsheet), which is open for comment until 5 December 2023. Here, I reflect on these documents from the perspective of AI procurement as a major plank of this governance reform.

Procurement in the AI Executive Order

Section 2 of the AI Executive Order formulates eight guiding principles and priorities in advancing and governing the development and use of AI. Section 2(g) refers to AI risk management, and states that

It is important to manage the risks from the Federal Government’s own use of AI and increase its internal capacity to regulate, govern, and support responsible use of AI to deliver better results for Americans. These efforts start with people, our Nation’s greatest asset. My Administration will take steps to attract, retain, and develop public service-oriented AI professionals, including from underserved communities, across disciplines — including technology, policy, managerial, procurement, regulatory, ethical, governance, and legal fields — and ease AI professionals’ path into the Federal Government to help harness and govern AI. The Federal Government will work to ensure that all members of its workforce receive adequate training to understand the benefits, risks, and limitations of AI for their job functions, and to modernize Federal Government information technology infrastructure, remove bureaucratic obstacles, and ensure that safe and rights-respecting AI is adopted, deployed, and used.

Section 10 then establishes specific measures to advance Federal Government use of AI. Section 10.1(b) details a set of governance reforms to be implemented in view of the Director of the Office of Management and Budget (OMB)’s guidance to strengthen the effective and appropriate use of AI, advance AI innovation, and manage risks from AI in the Federal Government. Section 10.1(b) includes the following (emphases added):

The Director of OMB’s guidance shall specify, to the extent appropriate and consistent with applicable law:

(i) the requirement to designate at each agency within 60 days of the issuance of the guidance a Chief Artificial Intelligence Officer who shall hold primary responsibility in their agency, in coordination with other responsible officials, for coordinating their agency’s use of AI, promoting AI innovation in their agency, managing risks from their agency’s use of AI …;

(ii) the Chief Artificial Intelligence Officers’ roles, responsibilities, seniority, position, and reporting structures;

(iii) for [covered] agencies […], the creation of internal Artificial Intelligence Governance Boards, or other appropriate mechanisms, at each agency within 60 days of the issuance of the guidance to coordinate and govern AI issues through relevant senior leaders from across the agency;

(iv) required minimum risk-management practices for Government uses of AI that impact people’s rights or safety, including, where appropriate, the following practices derived from OSTP’s Blueprint for an AI Bill of Rights and the NIST AI Risk Management Framework: conducting public consultation; assessing data quality; assessing and mitigating disparate impacts and algorithmic discrimination; providing notice of the use of AI; continuously monitoring and evaluating deployed AI; and granting human consideration and remedies for adverse decisions made using AI;

(v) specific Federal Government uses of AI that are presumed by default to impact rights or safety;

(vi) recommendations to agencies to reduce barriers to the responsible use of AI, including barriers related to information technology infrastructure, data, workforce, budgetary restrictions, and cybersecurity processes;

(vii) requirements that [covered] agencies […] develop AI strategies and pursue high-impact AI use cases;

(viii) in consultation with the Secretary of Commerce, the Secretary of Homeland Security, and the heads of other appropriate agencies as determined by the Director of OMB, recommendations to agencies regarding:

(A) external testing for AI, including AI red-teaming for generative AI, to be developed in coordination with the Cybersecurity and Infrastructure Security Agency;

(B) testing and safeguards against discriminatory, misleading, inflammatory, unsafe, or deceptive outputs, as well as against producing child sexual abuse material and against producing non-consensual intimate imagery of real individuals (including intimate digital depictions of the body or body parts of an identifiable individual), for generative AI;

(C) reasonable steps to watermark or otherwise label output from generative AI;

(D) application of the mandatory minimum risk-management practices defined under subsection 10.1(b)(iv) of this section to procured AI;

(E) independent evaluation of vendors’ claims concerning both the effectiveness and risk mitigation of their AI offerings;

(F) documentation and oversight of procured AI;

(G) maximizing the value to agencies when relying on contractors to use and enrich Federal Government data for the purposes of AI development and operation;

(H) provision of incentives for the continuous improvement of procured AI; and

(I) training on AI in accordance with the principles set out in this order and in other references related to AI listed herein; and

(ix) requirements for public reporting on compliance with this guidance.

Section 10.1(b) of the AI Executive Order establishes two sets or types of requirements.

First, there are internal governance requirements and these revolve around the appointment of Chief Artificial Intelligence Officers (CAIOs), AI Governance Boards, their roles, and support structures. This set of requirements seeks to strengthen the ability of Federal Agencies to understand AI and to provide effective safeguards in its governmental use. The crucial set of substantive protections from this internal perspective derives from the required minimum risk-management practices for Government uses of AI, which is directly placed under the responsibility of the relevant CAIO.

Second, there are external (or relational) governance requirements that revolve around the agency’s ability to control and challenge tech providers. This involves the transfer (back to back) of minimum risk-management practices to AI contractors, but also includes commercial considerations. The tone of the Executive Order indicates that this set of requirements is meant to neutralise risks of commercial capture and commercial determination by imposing oversight and external verification. From an AI procurement governance perspective, the requirements in Section 10.1(b)(viii) are particularly relevant. As some of those requirements will need further development with a view to their operationalisation, Section 10.1(d)(ii) of the AI Executive Order requires the Director of OMB to develop an initial means to ensure that agency contracts for the acquisition of AI systems and services align with its Section 10.1(b) guidance.

Procurement in the Draft AI in Government Policy

The guidance required by Section 10.1(b) of the AI Executive Order has been formulated in the Draft AI in Government Policy, which offers more detail on the relevant governance mechanisms and the requirements for AI procurement. Section 5 on managing risks from the use of AI is particularly relevant from an AI procurement perspective. While Section 5(d) refers explicitly to managing risks in AI procurement, given that the primary substantive obligations will arise from the need to comply with the required minimum risk-management practices for Government uses of AI, this specific guidance needs to be read in the broader context of AI risk-management within Section 5 of the Draft AI in Government Policy.

Scope

The Draft AI in Government Policy relies on a tiered approach to AI risk by imposing specific obligations in relation to safety-impacting and rights-impacting AI only. This is an important element of the policy because these two categories are defined (in Section 6) and in principle will cover pre-established lists of AI use, based on a set of presumptions (Section 5(b)(i) and (ii)). However, CAIOs will be able to waive the application of minimum requirements for specific AI uses where, ‘based upon a system-specific risk assessment, [it is shown] that fulfilling the requirement would increase risks to safety or rights overall or would create an unacceptable impediment to critical agency operations‘ (Section 5(c)(iii)). Therefore, these are not closed lists and the specific scope of coverage of the policy will vary with such determinations. There are also some exclusions from minimum requirements where the AI is used for narrow purposes (Section 5(c)(i))—notably the ‘Evaluation of a potential vendor, commercial capability, or freely available AI capability that is not otherwise used in agency operations, solely for the purpose of making a procurement or acquisition decision’; AI evaluation in the context of regulatory enforcement, law enforcement or national security action; or research and development.

The scope of the policy may be under-inclusive, or generate risks of under-inclusiveness at the boundary, in two respects. First, the way AI is defined for the purposes of the Draft AI in Government Policy excludes ‘robotic process automation or other systems whose behavior is defined only by human-defined rules or that learn solely by repeating an observed practice exactly as it was conducted’ (Section 6). This could be under-inclusive to the extent that the minimum risk-management practices for Government uses of AI create requirements that are not otherwise applicable to Government use of (non-AI) algorithms. There is a commonality of risks (eg discrimination, data governance risks) that would be better managed if there were a joined-up approach. Moreover, developing minimum practices in relation to those means of automation would serve to develop institutional capability that could then support the adoption of AI as defined in the policy. Second, the variability in coverage stemming from consideration of ‘unacceptable impediments to critical agency operations’ opens the door to potentially problematic waivers. While these are subject to disclosure and notification to OMB, it is not entirely clear on what grounds OMB could challenge those waivers. This is thus an area where the guidance may require further development.

Extensions and waivers

In relation to covered safety-impacting or rights-impacting AI (as above), Section 5(a)(i) establishes the important principle that US Federal Government agencies have until 1 August 2024 to implement the minimum practices in Section 5(c), ‘or else stop using any AI that is not compliant with the minimum practices’. This type of sunset clause concerning the currently implicit authorisation for the use of AI is a potentially powerful mechanism. However, the Draft also establishes that such obligation to discontinue non-compliant AI use must be ‘consistent with the details and caveats in that section [5(c)]’, which includes the possibility, until 1 August 2024, for agencies to

request from OMB an extension of limited and defined duration for a particular use of AI that cannot feasibly meet the minimum requirements in this section by that date. The request must be accompanied by a detailed justification for why the agency cannot achieve compliance for the use case in question and what practices the agency has in place to mitigate the risks from noncompliance, as well as a plan for how the agency will come to implement the full set of required minimum practices from this section.

Again, the guidance does not detail on what grounds OMB would grant those extensions or how long they would be for. There is a clear interaction between the extension and waiver mechanisms. For example, an agency that saw its request for an extension declined could try to waive that particular AI use—or agencies could simply try to waive AI uses rather than applying for extensions, as the requirements for a waiver seem to be rather different (and potentially less demanding) than those applicable to an extension. In that regard, it seems that waiver determinations are ‘all or nothing’, whereas the system could be more flexible (and protective) if waiver decisions not only needed to explain why meeting the minimum requirements would generate heightened overall risks or pose ‘unacceptable impediments to critical agency operations’, but also had to meet the lower burden of mitigation currently expected in extension applications, that is, a detailed justification of the practices the agency has in place to mitigate the risks from noncompliance where they can only be partly mitigated. In other words, it would be preferable to have a more continuous spectrum of mitigation measures in the context of waivers as well.

General minimum practices

Both in relation to safety-impacting and rights-impacting AI uses, the Draft AI in Government Policy would require agencies to engage in risk management both before and while using AI.

Preventative measures include:

  • completing an AI Impact Assessment documenting the intended purpose of the AI and its expected benefit, the potential risks of using AI, and an analysis of the quality and appropriateness of the relevant data;

  • testing the AI for performance in a real-world context—that is, testing under conditions that ‘mirror as closely as possible the conditions in which the AI will be deployed’; and

  • independently evaluating the AI, with the particularly important requirement that ‘The independent reviewing authority must not have been directly involved in the system’s development.’ In my view, it would also be important for the independent reviewing authority not to be involved in the future use of the AI, as its (future) operational interest could also be a source of bias in the testing process and the analysis of its results.

In-use measures include:

  • conducting ongoing monitoring and establishing thresholds for periodic human review, with a focus on monitoring ‘degradation to the AI’s functionality and to detect changes in the AI’s impact on rights or safety’—‘human review, including renewed testing for performance of the AI in a real-world context, must be conducted at least annually, and after significant modifications to the AI or to the conditions or context in which the AI is used’;

  • mitigating emerging risks to rights and safety—crucially, ‘Where the AI’s risks to rights or safety exceed an acceptable level and where mitigation is not practicable, agencies must stop using the affected AI as soon as is practicable’. In that regard, the draft indicates that ‘Agencies are responsible for determining how to safely decommission AI that was already in use at the time of this memorandum’s release without significant disruptions to essential government functions’, but it would seem that this is also a process that would benefit from close oversight by OMB as it would otherwise jeopardise the effectiveness of the extension and waiver mechanisms discussed above—in which case additional detail in the guidance would be required;

  • ensuring adequate human training and assessment;

  • providing appropriate human consideration as part of decisions that pose a high risk to rights or safety; and

  • providing public notice and plain-language documentation through the AI use case inventory—however, this is subject to a large number of caveats (notice must be ‘consistent with applicable law and governmentwide guidance, including those concerning protection of privacy and of sensitive law enforcement, national security, and other protected information’) and more detailed guidance on how to assess these issues would be welcome (if it exists, a cross-reference in the draft policy would be helpful).

Additional minimum practices for rights-impacting AI

In relation to rights-affecting AI only, the Draft AI in Government Policy would require agencies to take additional measures.

Preventative measures include:

  • taking steps to ensure that the AI will advance equity, dignity, and fairness—including proactively identifying and removing factors contributing to algorithmic discrimination or bias; assessing and mitigating disparate impacts; and using representative data; and

  • consulting and incorporating feedback from affected groups.

In-use measures include:

  • conducting ongoing monitoring and mitigation for AI-enabled discrimination;

  • notifying negatively affected individuals—this is an area where the draft guidance is rather woolly, as it also includes a set of complex caveats, as individual notice that ‘AI meaningfully influences the outcome of decisions specifically concerning them, such as the denial of benefits’ must only be given ‘[w]here practicable and consistent with applicable law and governmentwide guidance’. Moreover, the draft only indicates that ‘Agencies are also strongly encouraged to provide explanations for such decisions and actions’, but not required to. In my view, this tackles two of the most important implications for individuals in Government use of AI: the possibility to understand why decisions are made (reason giving duties) and the burden of challenging automated decisions, which is increased if there is a lack of transparency on the automation. Therefore, on this point, the guidance seems too tepid—especially bearing in mind that this requirement only applies to ‘AI whose output serves as a basis for decision or action that has a legal, material, or similarly significant effect on an individual’s’ civil rights, civil liberties, or privacy; equal opportunities; or access to critical resources or services. In these cases, it seems clear that notice and explainability requirements need to go further.

  • maintaining human consideration and remedy processes—including ‘potential remedy to the use of the AI by a fallback and escalation system in the event that an impacted individual would like to appeal or contest the AI’s negative impacts on them. In developing appropriate remedies, agencies should follow OMB guidance on calculating administrative burden and the remedy process should not place unnecessary burden on the impacted individual. When law or governmentwide guidance precludes disclosure of the use of AI or an opportunity for an individual appeal, agencies must create appropriate mechanisms for human oversight of rights-impacting AI’. This is another crucial area concerning rights not to be subjected to fully-automated decision-making where there is no meaningful remedy. This is also an area of the guidance that requires more detail, especially as to what is the adequate balance of burdens where eg the agency can automate the undoing of negative effects on individuals identified as a result of challenges by other individuals or in the context of the broader monitoring of the functioning and effects of the rights-impacting AI. In my view, this would be an opportunity to mandate automation of remediation in a meaningful way.

  • maintaining options to opt-out where practicable.

Procurement-related practices

In addition to the need for agencies to be able to meet the above requirements in relation to procured AI—which will in itself create the need to cascade some of the requirements down to contractors, and which will be the object of future guidance on how to ensure that AI contracts align with the requirements—the Draft AI in Government Policy also requires that agencies procuring AI manage risks by:

  • aligning to National Values and Law by ensuring ‘that procured AI exhibits due respect for our Nation’s values, is consistent with the Constitution, and complies with all other applicable laws, regulations, and policies, including those addressing privacy, confidentiality, copyright, human and civil rights, and civil liberties’;

  • taking ‘steps to ensure transparency and adequate performance for their procured AI, including by: obtaining adequate documentation of procured AI, such as through the use of model, data, and system cards; regularly evaluating AI-performance claims made by Federal contractors, including in the particular environment where the agency expects to deploy the capability; and considering contracting provisions that incentivize the continuous improvement of procured AI’;

  • taking ‘appropriate steps to ensure that Federal AI procurement practices promote opportunities for competition among contractors and do not improperly entrench incumbents. Such steps may include promoting interoperability and ensuring that vendors do not inappropriately favor their own products at the expense of competitors’ offering’;

  • maximizing the value of data for AI; and

  • responsibly procuring Generative AI.

These high-level requirements are well targeted and compliance with them would go a long way towards fostering ‘responsible AI procurement’ through adequate risk mitigation, in ways that still allow the procurement mechanism to harness market forces to generate value for money.

However, operationalising these requirements will be complex and the further OMB guidance should be rather detailed and practical.

Final thoughts

In my view, the AI Executive Order and the Draft AI in Government Policy lay the foundations for a significant strengthening of the governance of AI procurement with a view to embedding safeguards in public sector AI use. A crucially important characteristic in the design of these governance mechanisms is that they impose significant duties on the agencies seeking to procure and use AI, and explicitly seek to address risks of commercial capture and commercial determination. Another crucially important characteristic is that, at least in principle, use of AI is made conditional on compliance with a rather comprehensive set of preventative and in-use risk mitigation measures. The general aspects of this governance approach thus offer a very valuable blueprint for other jurisdictions considering how to boost AI procurement governance.

However, as always, the devil is in the details. One of the crucial risks in this approach to AI governance concerns a lack of independence of the entities making the relevant assessments. In the Draft AI in Government Policy, there are some risks of under-inclusion and/or excessive waivers of compliance with the relevant requirements (both explicit and implicit, through protracted processes of decommissioning of non-compliant AI), as well as a risk that ‘practical considerations’ will push compliance with the risk mitigation requirements well past the (ambitious) 1 August 2024 deadline through long or rolling extensions.

To mitigate these risks, the guidance should be much clearer on the role of OMB in extension, waiver and decommissioning decisions, as well as on the specific criteria and limits that should form part of those decisions. Only by ensuring adequate OMB intervention can a system of governance that still does not entirely (organisationally) separate procurement, use and oversight decisions reach the levels of independent verification required to neutralise not only commercial determination, but also operational dependency and the ‘policy irresistibility’ of digital technologies.

Thoughts on the AI Safety Summit from a public sector procurement & use of AI perspective

The UK Government hosted an AI Safety Summit on 1-2 November 2023. A summary of the targeted discussions in a set of 8 roundtables has been published for Day 1, as well as a set of Chair’s statements for Day 2, including considerations around safety testing, the state of the science, and a general summary of discussions. There is also, of course, the (flagship?) Bletchley Declaration, and an introduction to the announced AI Safety Institute (UK AISI).

In this post, I collect some of my thoughts on these outputs of the AI Safety Summit from the perspective of public sector procurement and use of AI.

What was said at the AI Safety Summit?

Although the summit was narrowly targeted to discussion of ‘frontier AI’ as particularly advanced AI systems, some of the discussions seem to have involved issues also applicable to less advanced (ie currently in existence) AI systems, and even to non-AI algorithms used by the public sector. As the general summary reflects, ‘There was also substantive discussion of the impact of AI upon wider societal issues, and suggestions that such risks may themselves pose an urgent threat to democracy, human rights, and equality. Participants expressed a range of views as to which risks should be prioritised, noting that addressing frontier risks is not mutually exclusive from addressing existing AI risks and harms.’ Crucially, ‘participants across both days noted a range of current AI risks and harmful impacts, and reiterated the need for them to be tackled with the same energy, cross-disciplinary expertise, and urgency as risks at the frontier.’ Hopefully, then, some of the rather far-fetched discussions of future existential risks can be conducive to taking action on current harms and risks arising from the procurement and use of less advanced systems.

There seemed to be some recognition of the need for more State intervention through regulation, for more regulatory control of standard-setting, and for more attention to be paid to testing and evaluation in the procurement context. For example, the summary of Day 1 discussions indicates that participants agreed that

  • ‘We should invest in basic research, including in governments’ own systems. Public procurement is an opportunity to put into practice how we will evaluate and use technology.’ (Roundtable 4)

  • ‘Company policies are just the baseline and don’t replace the need for governments to set standards and regulate. In particular, standardised benchmarks will be required from trusted external third parties such as the recently announced UK and US AI Safety Institutes.’ (Roundtable 5)

In Day 2, in the context of safety testing, participants agreed that

  • Governments have a responsibility for the overall framework for AI in their countries, including in relation to standard setting. Governments recognise their increasing role for seeing that external evaluations are undertaken for frontier AI models developed within their countries in accordance with their locally applicable legal frameworks, working in collaboration with other governments with aligned interests and relevant capabilities as appropriate, and taking into account, where possible, any established international standards.

  • Governments plan, depending on their circumstances, to invest in public sector capability for testing and other safety research, including advancing the science of evaluating frontier AI models, and to work in partnership with the private sector and other relevant sectors, and other governments as appropriate to this end.

  • Governments will plan to collaborate with one another and promote consistent approaches in this effort, and to share the outcomes of these evaluations, where sharing can be done safely, securely and appropriately, with other countries where the frontier AI model will be deployed.

This could be a basis on which to build an international consensus on the need for more robust and decisive regulation of AI development and testing, as well as a consensus on the considerations and constraints that should apply to the procurement and use of AI by the public sector in a way that is compliant with individual (human) rights and social interests. The general summary reflects that ‘Participants welcomed the exchange of ideas and evidence on current and upcoming initiatives, including individual countries’ efforts to utilise AI in public service delivery and elsewhere to improve human wellbeing. They also affirmed the need for the benefits of AI to be made widely available’.

However, some statements seem at first sight contradictory or problematic. While the excerpt above stresses that ‘Governments have a responsibility for the overall framework for AI in their countries, including in relation to standard setting’ (emphasis added), the general summary also stresses that ‘The UK and others recognised the importance of a global digital standards ecosystem which is open, transparent, multi-stakeholder and consensus-based and many standards bodies were noted, including the International Standards Organisation (ISO), International Electrotechnical Commission (IEC), Institute of Electrical and Electronics Engineers (IEEE) and relevant study groups of the International Telecommunication Union (ITU).’ Quite how State responsibility for standard setting fits with industry-led standard setting by such organisations is not only difficult to fathom, but also potentially one of the most problematic issues, due to the risk of regulatory tunnelling that delegating standard setting without a verification or certification mechanism entails.

Moreover, there seemed to be insufficient agreement around crucial issues, which are summarised as ‘a set of more ambitious policies to be returned to in future sessions’, including:

‘1. Multiple participants suggested that existing voluntary commitments would need to be put on a legal or regulatory footing in due course. There was agreement about the need to set common international standards for safety, which should be scientifically measurable.

2. It was suggested that there might be certain circumstances in which governments should apply the principle that models must be proven to be safe before they are deployed, with a presumption that they are otherwise dangerous. This principle could be applied to the current generation of models, or applied when certain capability thresholds were met. This would create certain ‘gates’ that a model had to pass through before it could be deployed.

3. It was suggested that governments should have a role in testing models not just pre- and post-deployment, but earlier in the lifecycle of the model, including early in training runs. There was a discussion about the ability of governments and companies to develop new tools to forecast the capabilities of models before they are trained.

4. The approach to safety should also consider the propensity for accidents and mistakes; governments could set standards relating to how often the machine could be allowed to fail or surprise, measured in an observable and reproducible way.

5. There was a discussion about the need for safety testing not just in the development of models, but in their deployment, since some risks would be contextual. For example, any AI used in critical infrastructure, or equivalent use cases, should have an infallible off-switch.

8. Finally, the participants also discussed the question of equity, and the need to make sure that the broadest spectrum was able to benefit from AI and was shielded from its harms.’

All of these are crucial considerations in relation to the regulation of AI development, (procurement) and use. A lack of consensus around these issues already indicates that there was a generic agreement that some regulation is necessary, but much more limited agreement on what regulation is necessary. This is clearly reflected in what was actually agreed at the summit.

What was agreed at the AI Safety Summit?

Despite all the discussions, little was actually agreed at the AI Safety Summit. The Bletchley Declaration includes a lengthy (but rather uncontroversial?) description of the potential benefits and actual risks of (frontier) AI, some rather generic agreement that ‘something needs to be done’ (eg welcoming ‘the recognition that the protection of human rights, transparency and explainability, fairness, accountability, regulation, safety, appropriate human oversight, ethics, bias mitigation, privacy and data protection needs to be addressed’) and very limited and unspecific commitments.

Indeed, signatories only ‘committed’ to a joint agenda, comprising:

  • ‘identifying AI safety risks of shared concern, building a shared scientific and evidence-based understanding of these risks, and sustaining that understanding as capabilities continue to increase, in the context of a wider global approach to understanding the impact of AI in our societies.

  • building respective risk-based policies across our countries to ensure safety in light of such risks, collaborating as appropriate while recognising our approaches may differ based on national circumstances and applicable legal frameworks. This includes, alongside increased transparency by private actors developing frontier AI capabilities, appropriate evaluation metrics, tools for safety testing, and developing relevant public sector capability and scientific research’ (emphases added).

This does not amount to much that would not happen anyway and, given that one of the UK Government’s objectives for the Summit was to create mechanisms for global collaboration (‘a forward process for international collaboration on frontier AI safety, including how best to support national and international frameworks’), this agreement for each jurisdiction to do things as it sees fit in accordance with its own circumstances and to collaborate ‘as appropriate’ seems like a very poor ‘win’.

In reality, there seems to be little coming out of the Summit other than a plan to continue the conversations in 2024. Given what had been said in one of the roundtables (no 5) in relation to the need to put adequate safeguards in place (‘this work is urgent, and must be put in place in months, not years’), it looks like the ‘to be continued’ approach won’t do or, at least, cannot be claimed to have made much of a difference.

What did the UK Government promise at the AI Safety Summit?

A more specific development announced on the occasion of the Summit (and overshadowed by the earlier US announcement) is that the UK will create the AI Safety Institute (UK AISI), a ‘state-backed organisation focused on advanced AI safety for the public interest. Its mission is to minimise surprise to the UK and humanity from rapid and unexpected advances in AI. It will work towards this by developing the sociotechnical infrastructure needed to understand the risks of advanced AI and enable its governance.’

Crucially, ‘The Institute will focus on the most advanced current AI capabilities and any future developments, aiming to ensure that the UK and the world are not caught off guard by progress at the frontier of AI in a field that is highly uncertain. It will consider open-source systems as well as those deployed with various forms of access controls. Both AI safety and security are in scope’ (emphasis added). This seems to carry forward the extremely narrow focus on ‘frontier AI’ and catastrophic risks that augured a failure of the Summit. It is also in clear contrast with the much more sensible consensus, repeatedly expressed at the Summit, that other types of AI cause very significant risks and that participants ‘noted a range of current AI risks and harmful impacts, and reiterated the need for them to be tackled with the same energy, cross-disciplinary expertise, and urgency as risks at the frontier.’

Also crucially, UK AISI ‘is not a regulator and will not determine government regulation. It will collaborate with existing organisations within government, academia, civil society, and the private sector to avoid duplication, ensuring that activity is both informing and complementing the UK’s regulatory approach to AI as set out in the AI Regulation white paper’.

According to initial plans, UK AISI ‘will initially perform 3 core functions:

  • Develop and conduct evaluations on advanced AI systems, aiming to characterise safety-relevant capabilities, understand the safety and security of systems, and assess their societal impacts

  • Drive foundational AI safety research, including through launching a range of exploratory research projects and convening external researchers

  • Facilitate information exchange, including by establishing – on a voluntary basis and subject to existing privacy and data regulation – clear information-sharing channels between the Institute and other national and international actors, such as policymakers, international partners, private companies, academia, civil society, and the broader public’

It is also stated that ‘We see a key role for government in providing external evaluations independent of commercial pressures and supporting greater standardisation and promotion of best practice in evaluation more broadly.’ However, the extent to which UK AISI will be able to do that will hinge on issues that are not currently clear (or publicly disclosed), such as the membership of UK AISI or its institutional set up (as ‘state-backed organisation’ does not say much about this).

On that very point, it is somewhat problematic that the UK AISI ‘is an evolution of the UK’s Frontier AI Taskforce. The Frontier AI Taskforce was announced by the Prime Minister and Technology Secretary in April 2023’ (ahem, as ‘Foundation Model Taskforce’—so this is the second rebranding of the same initiative in half a year). It is also problematic that UK AISI ‘will continue the Taskforce’s safety research and evaluations. The other core parts of the Taskforce’s mission will remain in [the Department for Science, Innovation and Technology] as policy functions: identifying new uses for AI in the public sector; and strengthening the UK’s capabilities in AI.’ I find the retention of analysis pertaining to public sector AI use within government problematic and a clear indication of the UK Government’s unwillingness to put meaningful mechanisms in place to monitor the process of public sector digitalisation. UK AISI very much sounds like a research institute with a focus on a very narrow set of AI systems and with a remit that will hardly translate into relevant policymaking in areas in dire need of regulation. Finally, it is also very problematic that funding is not locked in: ‘The Institute will be backed with a continuation of the Taskforce’s 2024 to 2025 funding as an annual amount for the rest of this decade, subject to it demonstrating the continued requirement for that level of public funds.’ In reality, this means that the Institute’s continued existence will depend on the Government’s satisfaction with its work and the direction of travel of its activities and outputs. This is not at all conducive to independence, in my view.

So, all in all, there is very little new in the announcement of the creation of the UK AISI and, while there is a (theoretical) possibility for the Institute to make a positive contribution to regulating AI procurement and use (in the public sector), this seems extremely remote and potentially undermined by the Institute’s institutional set up. This is probably in stark contrast with the US approach the UK is trying to mimic (though more on the US approach in a future entry).

European Commission wants to see more AI procurement. Ok, but priorities need reordering

The European Commission recently published its 2023 State of the Digital Decade report. One of its key takeaways is that the Commission recommends that Member States step up innovation procurement investment in the digital sector.

The Commission has identified that ‘While the roll-out of digital public services is progressing steadily, investment in public procurement of innovative digital solutions (e.g. based on AI or big data) is insufficient and would need to increase substantially from EUR 188 billion to EUR 295 billion in order to reach full speed adoption of innovative digital solutions in public services’ (para 4.2, original emphasis).

The Commission has thus recommended that ‘Member States should step up investment and regulatory measures to develop and make available secure, sovereign and interoperable digital solutions for online public and government services’; and that ‘Member States should develop action plans in support of innovation procurement and step up efforts to increase public procurement investments in developing, testing and deploying innovative digital solutions’.

Tucked away in a different part of the report (which, frankly, has a rather odd structure), the Commission also recommends that ‘Member States should foster the availability of legal and technical support to procure and implement trustworthy and sovereign AI solutions across sectors.’

To my mind, the priorities for investment of public money need to be further clarified. Without a significant investment in an ambitious plan to quickly expand the public sector’s digital skills and capabilities, there can be no hope that increased procurement expenditure in digital technologies will bring adequate public sector digitalisation or foster the public interest more broadly.

Without a sophisticated public buyer that can adequately cut through the process of technological innovation, there is no hope that ‘throwing money at the problem’ will bring meaningful change. In my view, the focus and priority should be on upskilling the public sector before anything else—including ahead of the also recommended mobilisation of ‘public policies, including innovative procurement to foster the scaling up of start-ups, to facilitate the creation of spinoffs from universities and research centres, and to monitor progress in this area’ (para 3.2.3). Perhaps a substantial fraction of the 100+ billion EUR the Commission expects Member States to put into public sector digitalisation could go to building up the required capability… too much to ask?

AI in the public sector: can procurement promote trustworthy AI and avoid commercial capture?

The recording and slides of the public lecture on ‘AI in the public sector: can procurement promote trustworthy AI and avoid commercial capture?’ I gave at the University of Bristol Law School on 4 July 2023 are now available. As always, any further comments most warmly received at: a.sanchez-graells@bristol.ac.uk.

This lecture brought my research project to an end. I will now focus on finalising the manuscript and sending it off to the publisher, and then take a break for the rest of the summer. I will share details of the forthcoming monograph in a few months. I hope to restart blogging in September. In the meantime, I wish all HTCaN friends all the best. Albert

Two policy briefings on digital technologies and procurement

Now that my research project ‘Digital technologies and public procurement. Gatekeeping and experimentation in digital public governance’ nears its end, some outputs start to emerge. In this post, I would like to highlight two policy briefings summarising some of my top-level policy recommendations, and providing links to more detailed analysis. All materials are available in the ‘Digital Procurement Governance’ tab.

Policy Briefing 1: ‘Guaranteeing public sector adoption of trustworthy AI - a task that should not be left to procurement’

What's the rush -- some thoughts on the UK's Foundation Model Taskforce and regulation by Twitter

I have been closely following developments on AI regulation in the UK, as part of the background research for the joint submission to the public consultation closing on Wednesday (see here and here). Perhaps surprisingly, the biggest developments do not concern the regulation of AI under the devolved model described in the ‘pro-innovation’ white paper, but its displacement outside existing regulatory regimes—both in terms of funding, and practical power.

Most of the activity and investments are not channelled towards existing resource-strained regulators to support them in their task of issuing guidance on how to deal with AI risks and harms—which stems from the white paper—but towards digital industrial policy and R&D projects, including a major new research centre on responsible and trustworthy AI and a Foundation Model Taskforce. A first observation is that this type of investment can be worthwhile, but not at the expense of adequately resourcing regulators facing the tall order of AI regulation.

The UK’s Prime Minister is clearly making a move to use ‘world-leadership in AI safety’ as a major plank of his re-election bid in the coming Fall. I am not only sceptical about this move and its international reception, but also increasingly concerned about a tendency to ‘regulate by Twitter’ and about bullish approaches to regulatory and legal compliance that could well result in squandering a good part of the £100m set aside for the Taskforce.

In this blog, I offer some preliminary thoughts. Comments welcome!

Twitter announcements vs white paper?

During the preparation of our response to the AI public consultation, we had a moment of confusion. The Government published the white paper and an impact assessment supporting it, which primarily amount to doing nothing and maintaining the status quo (aka AI regulatory gap) in the UK. However, there were increasing reports of the Prime Minister’s change of heart after the emergence of a ‘doomer’ narrative peddled by OpenAI’s CEO and others. At some point, the PM sent out a tweet that made us wonder if the Government was changing policy and abandoning the approach of the white paper even before the end of the public consultation. This was the tweet.

We could not locate any document describing the ‘Safe strategy of AI’, so the only conclusion we could reach is that the ‘strategy’ was the short Twitter thread that followed that first tweet.

It was not only surprising that there was no detail, but also that there was no reference to the white paper or to any other official policy document. We were probably not the only ones confused about it (or so we hope!), as it is in general very confusing to have social media messaging pointing towards regulatory interventions completely outside the existing frameworks—including live public consultations by the government!

It is also confusing to see multiple documents referring to different things, with later documents somehow reframing what previous documents meant.

For example, the announcement of the Foundation Model Taskforce came only a few weeks after the publication of the white paper, but there was no mention of it in the white paper itself. Is it possible that the Government had put together a significant funding package and related policy in under a month? The question is not so much whether that is possible, but rather why do things in this way? And how mature was the thinking behind the Taskforce?

For example, the initial announcement indicated that

The investment will build the UK’s ‘sovereign’ national capabilities so our public services can benefit from the transformational impact of this type of AI. The Taskforce will focus on opportunities to establish the UK as a world leader in foundation models and their applications across the economy, and acting as a global standard bearer for AI safety.

The funding will be invested by the Foundation Model Taskforce in foundation model infrastructure and public service procurement, to create opportunities for domestic innovation. The first pilots targeting public services are expected to launch in the next six months.

Less than two months later, the announcement of the appointment of the Taskforce chair (below) indicated that

… a key focus for the Taskforce in the coming months will be taking forward cutting-edge safety research in the run up to the first global summit on AI safety to be hosted in the UK later this year.

Bringing together expertise from government, industry and academia, the Taskforce will look at the risks surrounding AI. It will carry out research on AI safety and inform broader work on the development of international guardrails, such as shared safety and security standards and infrastructure, that could be put in place to address the risks.

Is it then a Taskforce and pot of money seeking to develop sovereign capabilities and to pilot public sector AI use, or a Taskforce seeking to develop R&D in AI safety? Can it be both? Is there money for both? Also, why steer the £100m Taskforce in this direction and simultaneously spend £31m on funding an academic-led research centre on ethical and trustworthy AI? Does the latter not encompass issues of AI safety? How will all of these investments and initiatives be coordinated to avoid duplication of effort or the replication of regulatory gaps in the disparate consideration of regulatory issues?

Funding and collaboration opportunities announced via Twitter?

Things can get even more confusing or worrying (for me). Yesterday, the Government put out an official announcement and heavy Twitter-based PR to announce the appointment of the Chair of the Foundation Model Taskforce. This announcement raises a few questions. Why on Sunday? What was the rush? Also, what was the process used to select the Chair, if there was one? I have no questions on the profile and suitability of the appointed Chair (though I have not looked at them in detail), but I wonder … even if it is legally compliant to proceed without a formal process and an open call for expressions of interest, is this appropriate? Is the Government stretching the parallelism with the Vaccines Taskforce too far?

Relatedly, there has been no (or I have been unable to locate) official call for expressions of interest from those seeking to get involved with the Taskforce. However, once more, Twitter seems to have been the (pragmatic?) medium used by the newly appointed Chair of the Taskforce. On Sunday itself, this Twitter thread went out:

I find the last bit particularly shocking. A call for expressions of interest in participating in a project capable of spending up to £100m, via Google Forms! At the time of writing, the form is here and its content is as follows:

I find this approach to AI regulation rather concerning and can also see quite a few ways in which this emerging way of working could lead to breaches of procurement law, subsidy control rules, or recruitment processes (depending on whether expressions of interest are corporate or individual). I also wonder what sort of record-keeping will be kept of all this, so that there is adequate accountability for this expenditure. What is the rush?

Or rather, I know that the rush is simply politically-driven and that this is another way in which public funds are at risk for the wrong reasons. But for the entirely arbitrary deadline of the ‘world AI safety summit’ the PM wants to host in the UK in the Fall — preferably ahead of any general election, I would think — it is almost impossible to justify the change of gear between the ‘do nothing’ AI white paper and the ‘rush everything’ approach driving the Taskforce. I hope we will not end up with another set of enquiries and reports, such as those stemming from the PPE procurement scandal or the ventilator challenge, but it is hard to see how this can all be done in a legally compliant manner, and with the serenity, clarity of view and long-term thinking required of regulatory design. Even in the field of AI. Unavoidably, more to follow.

ChatGPT in the Public Sector -- should it be banned?

In ‘ChatGPT in the Public Sector – overhyped or overlooked?’ (24 Apr 2023), the Analysis and Research Team (ART) of the General Secretariat of the Council of the European Union provides a useful and accessible explanation of how ChatGPT works, as well as an interesting analysis of the risks and pitfalls of rushing to embed generative artificial intelligence (GenAI), and large language models (LLMs) in particular, in the functioning of the public administration.

The analysis stresses the risks stemming from ‘inaccurate, biased, or nonsensical’ GenAI outputs and, in particular, that ‘the key principles of public administration such as accountability, transparency, impartiality, or reliability need to be considered thoroughly in the [GenAI] integration process’.

The paper provides a helpful introduction to how LLMs work and their technical limitations. It then maps potential uses in the public administration, assesses the potential impact of their use on the European principles of public sector administration, and then suggests some measures to mitigate the relevant risks.

This analysis is helpful but, in my view, it is already captured by the presumption that LLMs are here to stay and that what regulators can do is just try to minimise their potential negative impacts—which implies accepting that there will remain unaddressed impacts. By referring to general principles of public administration, rather than eg the right to good administration under the EU Charter of Fundamental Rights, the analysis is also unnecessarily lenient.

I find this type of discourse dangerous and troubling because it facilitates the adoption of digital technologies that cannot meet current legal requirements and guarantees of individual rights. This is clear from the paper itself, although the implications of part of the analysis are not sufficiently explored, in my view.

The paper has a final section where it explicitly recognises that, while some risks might be mitigated by technological advancements, other risks are of a more structural nature and cannot be fully corrected despite best efforts. The paper then lists a very worrying panoply of such structural issues (at 16):

  • ‘This is the case for detecting and removing biases in training data and model outputs. Efforts to sanitize datasets can even worsen biases’.

  • ‘Related to biases is the risk of a perpetuation of the status quo. LLMs mirror the values, habits and attitudes that are present in their training data, which does not leave much space for changing or underrepresented societal views. Relying on LLMs that have been trained with previously produced documents in a public administration severely limits the scope for improvement and innovation and risks leaving the public sector even less flexible than it is already perceived to be’.

  • ‘The ‘black box’ issue, where AI models arrive at conclusions or decisions without revealing the process of how they were reached is also primarily structural’.

  • ‘Regulating new technologies will remain a cat-and-mouse game. Acceleration risk (the emergence of a race to deploy new AI as quickly as possible at the expense of safety standards) is also an area of concern’.

  • ‘Finally […] a major structural risk lies in overreliance, which may be bolstered by rapid technological advances. This could lead to a lack of critical thinking skills needed to adequately assess and oversee the model’s output, especially amongst a younger generation entering a workforce where such models are already being used’.

In my view, beyond the paper’s suggestion that the way forward is to maintain human involvement to monitor the way LLMs (mal)function in the public sector, we should be discussing the imposition of a ban on the adoption of LLMs (and other digital technologies) by the public sector unless it can be positively proven that their deployment will not affect individual rights and more diffuse public interests, and that any residual risks are adequately mitigated.

The current state of affairs is unacceptable in that the lack of regulation allows for a quickly accelerating accumulation of digital deployments that generate risks to social and individual rights and goods. The need to reverse this situation underlies my proposal to permission the adoption of digital technologies by the public sector. Unless we take a robust approach to slowing down and carefully considering the implications of public sector digitalisation, we may be undermining public governance in ways that will be very difficult or impossible to undo. It is not too late, but it may be soon.


Free registration open for two events on procurement and artificial intelligence

Registration is now open for two free events on procurement and artificial intelligence (AI).

First, a webinar where I will be participating in discussions on the role of procurement in contributing to the public sector’s acquisition of trustworthy AI, and the associated challenges, from an EU and US perspective.

Second, a public lecture where I will present the findings of my research project on digital technologies and public procurement.

Please scroll down for details and links to registration pages. All welcome!

1. ‘Can Procurement Be Used to Effectively Regulate AI?’ | Free online webinar
30 May 2023 2pm BST / 3pm CET-SAST / 9am EST (90 mins)
Co-organised by University of Bristol Law School and George Washington University Law School.

Artificial Intelligence (“AI”) regulation and governance is a global challenge that is starting to generate different responses in the EU, US, and other jurisdictions. Such responses are, however, rather tentative and politically contested. A full regulatory system will take time to crystallise and be fully operational. In the meantime, despite this regulatory gap, the public sector is quickly adopting AI solutions for a wide range of activities and public services.

This process of accelerated AI adoption by the public sector places procurement as the (involuntary) gatekeeper, tasked with ‘AI regulation by contract’, at least for now. The procurement function is expected to design tender procedures and contracts capable of attaining goals of AI regulation (such as trustworthiness, explainability, or compliance with data protection and human and fundamental rights) that are so far eluding more general regulation.

This webinar will provide an opportunity to take a hard look at the likely effectiveness of AI regulation by contract through procurement and its implications for the commercialisation of public governance, focusing on key issues such as:

  • The interaction between tender design, technical standards, and negotiations.

  • The challenges of designing, monitoring, and enforcing contractual clauses capable of delivering effective ‘regulation by contract’ in the AI space.

  • The tension between the commercial value of tailored contractual design and the regulatory value of default clauses and standard terms.

  • The role of procurement disputes and litigation in shaping AI regulation by contract.

  • The alternative regulatory option of establishing mandatory prior approval by an independent regulator of projects involving AI adoption by the public sector.

This webinar will be of interest to those working on or researching the digitalisation of the public sector and AI regulation in general, as the discussion around procurement gatekeeping mirrors the main issues arising from broader trends.

I will have the great opportunity of discussing my research with Aris Georgopoulos (Nottingham), Scott Simpson (Digital Transformation Lead at U.S. Department of Homeland Security), and Liz Chirico (Acquisition Innovation Lead at Office of the Deputy Assistant Secretary of the Army). Jessica Tillipman (GW Law) will moderate the discussion and Q&A.

Registration: https://law-gwu-edu.zoom.us/webinar/register/WN_w_V9s_liSiKrLX9N-krrWQ.

2. ‘AI in the public sector: can procurement promote trustworthy AI and avoid commercial capture?’ | Free in-person public lecture
4 July 2023 2pm BST, Reception Room, Wills Memorial Building, University of Bristol
Organised by University of Bristol Law School, Centre for Global Law and Innovation

The public sector is quickly adopting artificial intelligence (AI) to manage its interactions with citizens and in the provision of public services – for example, using chatbots in official websites, automated processes and call-centres, or predictive algorithms.

There are inherent high stakes risks to this process of public governance digitalisation, such as bias and discrimination, unethical deployment, data and privacy risks, cyber security risks, or risks of technological debt and dependency on proprietary solutions developed by (big) tech companies.

However, as part of the UK Government’s ‘light touch’ ‘pro-innovation’ approach to digital technology regulation, the adoption of AI in the public sector remains largely unregulated. 

In this public lecture, I will present the findings of my research funded by the British Academy, analysing how, in this deregulatory context, the existing rules on public procurement fall short of protecting the public interest.

An alternative approach is required to create mechanisms of external independent oversight and mandatory standards to embed trustworthy AI requirements and to mitigate against commercial capture in the acquisition of AI solutions. 

Registration: https://www.eventbrite.co.uk/e/can-procurement-promote-trustworthy-ai-and-avoid-commercial-capture-tickets-601212712407.

UK's 'pro-innovation approach' to AI regulation won't do, particularly for public sector digitalisation

Regulating artificial intelligence (AI) has become the challenge of the time. This is a crucial area of regulatory development and there are increasing calls—including from those driving the development of AI—for robust regulatory and governance systems. In this context, more details have now emerged on the UK’s approach to AI regulation.

Swimming against the tide, and seeking to diverge from the EU’s regulatory agenda and the EU AI Act, the UK announced a light-touch ‘pro-innovation approach’ in its July 2022 AI regulation policy paper. In March 2023, the same approach was supported by a Report of the Government Chief Scientific Adviser (the ‘GCSA Report’), and is now further developed in the White Paper ‘AI regulation: a pro-innovation approach’ (the ‘AI WP’). The UK Government has launched a public consultation that will run until 21 June 2023.

Given the relevance of the issue, it can be expected that the public consultation will attract a large volume of submissions, and that the ‘pro-innovation approach’ will be heavily criticised. Indeed, there is an on-going preparatory Parliamentary Inquiry on the Governance of AI that has already collected a wealth of evidence exploring the pros and cons of the regulatory approach outlined there. Moreover, initial reactions eg by the Public Law Project, the Ada Lovelace Institute, or the Royal Statistical Society have been (to different degrees) critical of the lack of regulatory ambition in the AI WP—while, as could be expected, think tanks closely linked to the development of the policy, such as the Alan Turing Institute, have expressed more positive views.

Whether the regulatory approach will shift as a result of the expected pushback is unclear. However, given that the AI WP follows the same deregulatory approach first suggested in 2018 and is strongly entrenched both politically and at policy level—for the UK Government has self-assessed this approach as ‘world leading’ and claims it will ‘turbocharge economic growth’—it is doubtful that much will change as a result of the public consultation.

That does not mean we should not engage with the public consultation; quite the opposite. In the face of the UK Government’s dereliction of duty, or lack of ideas, it is more important than ever that there is robust pushback against the deregulatory approach being pursued. Especially in the context of public sector digitalisation and the adoption of AI by the public administration and in the provision of public services, where the Government (unsurprisingly) is unwilling to create regulatory safeguards to protect citizens from its own action.

In this blogpost, I sketch my main areas of concern with the ‘pro-innovation approach’ in the GCSA Report and AI WP, which I will further develop for submission to the public consultation, building on earlier views. Feedback and comments would be gratefully received: a.sanchez-graells@bristol.ac.uk.

The ‘pro-innovation approach’ in the GCSA Report — squaring the circle?

In addition to proposals on the intellectual property (IP) regulation of generative AI, the opening up of public sector data, transport-related, or cyber security interventions, the GCSA Report focuses on ‘core’ regulatory and governance issues. The report stresses that regulatory fragmentation is one of the key challenges, as is the difficulty for the public sector in ‘attracting and retaining individuals with relevant skills and talent in a competitive environment with the private sector, especially those with expertise in AI, data analytics, and responsible data governance‘ (at 5). The report also further hints at the need to boost public sector digital capabilities by stressing that ‘the government and regulators should rapidly build capability and know-how to enable them to positively shape regulatory frameworks at the right time‘ (at 13).

Although the rationale is not very clearly stated, to bridge regulatory fragmentation and facilitate the pooling of digital capabilities from across existing regulators, the report makes a central proposal to create a multi-regulator AI sandbox (at 6-8). The report suggests that it could be convened by the Digital Regulatory Cooperation Forum (DRCF)—which brings together four key regulators (the Information Commissioner’s Office (ICO), Office of Communications (Ofcom), the Competition and Markets Authority (CMA) and the Financial Conduct Authority (FCA))—and that DRCF should look at ways of ‘bringing in other relevant regulators to encourage join up’ (at 7).

The report recommends that the AI sandbox should operate on the basis of a ‘commitment from the participant regulators to make joined-up decisions on regulations or licences at the end of each sandbox process and a clear feedback loop to inform the design or reform of regulatory frameworks based on the insights gathered. Regulators should also collaborate with standards bodies to consider where standards could act as an alternative or underpin outcome-focused regulation’ (at 7).

Therefore, the AI sandbox would not only be multi-regulator, but also encompass (in some way) standard-setting bodies (presumably UK ones only, though), without issues of public-private interaction in decision-making implying the exercise of regulatory public powers, or issues around regulatory capture and risks of commercial determination, being considered at all. The report in general is extremely industry-orientated, eg in stressing in relation to the overarching pacing problem that ‘for emerging digital technologies, the industry view is clear: there is a greater risk from regulating too early’ (at 5), without this being in any way balanced with clear (non-industry) views that the biggest risk is actually in regulating too late and that we are collectively frog-boiling into a ‘runaway AI’ fiasco.

Moreover, confusingly, despite the fact that the sandbox would be hosted by DRCF (of which the ICO is a leading member), the GCSA Report indicates that the AI sandbox ‘could link closely with the ICO sandbox on personal data applications’ (at 8). The fact that the report is itself unclear as to whether eg AI applications with data protection implications should be subjected to one or two sandboxes, or the extent to which the general AI sandbox would need to be integrated with sectoral sandboxes for non-AI regulatory experimentation, already indicates the complexity and dubious practical viability of the suggested approach.

It is also unclear why multiple sector regulators should be involved in any given iteration of a single AI sandbox where there may be no projects within their regulatory remit and expertise. The alternative approach of having an open or rolling AI sandbox mechanism led by a single AI authority, which would then draw expertise and work in collaboration with the relevant sector regulator as appropriate on a per-project basis, seems preferable. While some DRCF members could be expected to have to participate in a majority of sandbox projects (eg CMA and ICO), others would probably have a much less constant presence (eg Ofcom, or certainly the FCA).

Remarkably, despite this recognition of the functional need for a centralised regulatory approach and a single point of contact (primarily for industry’s convenience), the GCSA Report implicitly supports the 2022 AI regulation policy paper’s approach to not creating an overarching cross-sectoral AI regulator. The GCSA Report tries to create a ‘non-institutionalised centralised regulatory function’, nested under DRCF. In practice, however, implementing the recommendation for a single AI sandbox would create the need for the further development of the governance structures of the DRCF (especially if it was to grow by including many other sectoral regulators), or whichever institution ‘hosted it’, or else risk creating a non-institutional AI regulator with the related difficulties in ensuring accountability. This would add a layer of deregulation to the deregulatory effect that the sandbox itself creates (see eg Ranchordas (2021)).

The GCSA Report seems to try to square the circle of regulatory fragmentation by relying on cooperation as a centralising regulatory device, but it does this solely for the industry’s benefit and convenience, without paying any consideration to the future effectiveness of the regulatory framework. This is hard to understand, given the report’s identification of conflicting regulatory constraints, or in its terminology ‘incentives’: ‘The rewards for regulators to take risks and authorise new and innovative products and applications are not clear-cut, and regulators report that they can struggle to trade off the different objectives covered by their mandates. This can include delivery against safety, competition objectives, or consumer and environmental protection, and can lead to regulator behaviour and decisions that prioritise further minimising risk over supporting innovation and investment. There needs to be an appropriate balance between the assessment of risk and benefit’ (at 5).

This not only frames risk-minimisation as a negative regulatory outcome (and further feeds into the narrative that precautionary regulatory approaches are somehow not legitimate because they run against industry goals—which deserves strong pushback, see eg Kaminski (2022)), but also shows a main gap in the report’s proposal for the single AI sandbox. If each regulator has conflicting constraints, what evidence (if any) is there that collaborative decision-making will reduce, rather than exacerbate, such regulatory clashes? Are decisions meant to be arrived at by majority voting or in any other way expected to deactivate (some or most) regulatory requirements in view of (perceived) gains in relation to other regulatory goals? Why has there been no consideration of eg the problems encountered by concurrency mechanisms in the application of sectoral and competition rules (see eg Dunne (2014), (2020) and (2021)), as an obvious and immediate precedent of the same type of regulatory coordination problems?

The GCSA Report also seems to assume that collaboration through the AI sandbox would be resource neutral for participating regulators, whereas it seems reasonable to presume that this additional layer of regulation (even if not institutionalised) would require further resources. And, in any case, there does not seem to be much consideration of the viability of asking resource-strapped regulators to create an AI sandbox where they can (easily) be out-skilled and over-powered by industry participants.

In my view, the GCSA Report already points at significant weaknesses in the resistance to creating any new authorities despite the obvious functional need for centralised regulation (one of the main weaknesses, if not the single biggest weakness, of the AI WP), as well as at the lack of strategic planning around public sector digital capabilities, despite well-recognised challenges (see eg Committee of Public Accounts (2021)).

The ‘pro-innovation approach’ in the AI WP — a regulatory black hole, privatisation of AI regulation, or both?

The AI WP envisages an ‘innovative approach to AI regulation [that] uses a principles-based framework for regulators to interpret and apply to AI within their remits’ (para 36). It expects the framework to be ‘pro-innovation, proportionate, trustworthy, adaptable, clear and collaborative’ (para 37). As will become clear, however, such an ‘innovative approach’ solely amounts to the formulation of high-level, broad, open-textured and incommensurable principles to inform a soft law push towards the development of regulatory practices aligned with such principles in a highly fragmented and incomplete regulatory landscape.

The regulatory framework would be built on four planks (para 38): [i] an AI definition (paras 39-42); [ii] a context-specific approach (ie a ‘use-based’ approach, rather than a ‘technology-led’ approach, see paras 45-47); [iii] a set of cross-sectoral principles to guide regulator responses to AI risks and opportunities (paras 48-54); and [iv] new central functions to support regulators to deliver the AI regulatory framework (paras 70-73). In reality, though, there will be only two ‘pillars’ of the regulatory framework, and they do not involve any new institutions or rules. The AI WP vision thus largely seems to be that AI can be regulated in the UK in a world-leading manner without doing anything much at all.

AI Definition

The UK’s definition of AI will trigger substantive discussions, especially as it seeks to build it around ‘the two characteristics that generate the need for a bespoke regulatory response’: ‘adaptivity’ and ‘autonomy’ (para 39). Discussing the definitional issue is beyond the scope of this post but, on the specific identification of the ‘autonomy’ of AI, it is worth highlighting that this is an arguably flawed regulatory approach to AI (see Soh (2023)).

No new institutions

The AI WP makes clear that the UK Government has no plans to create any new AI regulator, either with a cross-sectoral (eg general AI authority) or sectoral remit (eg an ‘AI in the public sector authority’, as I advocate for). The Ministerial Foreword to the AI WP already stresses that ‘[t]o ensure our regulatory framework is effective, we will leverage the expertise of our world class regulators. They understand the risks in their sectors and are best placed to take a proportionate approach to regulating AI’ (at p2). The AI WP further stresses that ‘[c]reating a new AI-specific, cross-sector regulator would introduce complexity and confusion, undermining and likely conflicting with the work of our existing expert regulators’ (para 47). This however seems to presume that a new cross-sector AI regulator would be unable to coordinate with existing regulators, despite the institutional architecture of the regulatory framework foreseen in the AI WP entirely relying on inter-regulator collaboration (!).

No new rules

There will also be no new legislation underpinning regulatory activity, although the Government claims that the AI WP, ‘alongside empowering regulators to take a lead, [is] also setting expectations’ (at p3). The AI WP claims to develop a regulatory framework underpinned by five principles to guide and inform the responsible development and use of AI in all sectors of the economy: [i] Safety, security and robustness; [ii] Appropriate transparency and explainability; [iii] Fairness; [iv] Accountability and governance; and [v] Contestability and redress (para 10). However, they will not be put on a statutory footing (initially); ‘the principles will be issued on a non-statutory basis and implemented by existing regulators’ (para 11). While there is some detail on the intended meaning of these principles (see para 52 and Annex A), the principles necessarily lack precision and, worse, there is a conflation of the principles with other (existing) regulatory requirements.

For example, it is surprising that the AI WP describes fairness as implying that ‘AI systems should (sic) not undermine the legal rights of individuals or organisations, discriminate unfairly against individuals or create unfair market outcomes‘ (emphasis added), and stresses the expectation ‘that regulators’ interpretations of fairness will include consideration of compliance with relevant law and regulation’ (para 52). This encapsulates the risks that principles-based AI regulation ends up eroding compliance with and enforcement of current statutory obligations. A principle of AI fairness cannot modify or exclude existing legal obligations, and it should not risk doing so either.

Moreover, the AI WP suggests that, even if the principles are supported by a statutory duty for regulators to have regard to them, ‘while the duty to have due regard would require regulators to demonstrate that they had taken account of the principles, it may be the case that not every regulator will need to introduce measures to implement every principle’ (para 58). This conflates two issues. On the one hand, the need for activity subjected to regulatory supervision to comply with all principles and, on the other, the need for a regulator to take corrective action in relation to any of the principles. It should be clear that regulators have a duty to ensure that all principles are complied with in their regulatory remit, which does not seem to entirely or clearly follow from the weaker duty to have due regard to the principles.

Perpetuating regulatory gaps, in particular regarding public sector digitalisation

As a consequence of the decision not to create new regulators and the absence of new legislation, it is unclear whether the ‘regulatory strategy’ in the AI WP will have any real-world effects within existing regulatory frameworks, especially as the most ambitious intervention is to create ‘a statutory duty on regulators requiring them to have due regard to the principles’ (para 12)—but the Government may decide not to introduce it if ‘monitoring of the effectiveness of the initial, non-statutory framework suggests that a statutory duty is unnecessary’ (para 59).

However, what is already clear is that there is no new AI regulation on the horizon, despite the fact that the AI WP recognises that ‘some AI risks arise across, or in the gaps between, existing regulatory remits’ (para 27), that ‘there may be AI-related risks that do not clearly fall within the remits of the UK’s existing regulators’ (para 64), and the obvious and worrying existence of high risks to fundamental rights and values (para 4 and paras 22-25). The AI WP is naïve, to say the least, in setting out that ‘[w]here prioritised risks fall within a gap in the legal landscape, regulators will need to collaborate with government to identify potential actions. This may include identifying iterations to the framework such as changes to regulators’ remits, updates to the Regulators’ Code, or additional legislative intervention’ (para 65).

Hoping that such risk identification and gap analysis will take place without assigning specific responsibility for it—and seeking to exempt the Government from such responsibility—seems a bit too much to ask. In fact, this is at odds with the graphic depiction of how the AI WP expects the system to operate. As noted in (1) in the graph below, it is clear that the identification of risks that are cross-cutting or new (unregulated) risks that warrant intervention is assigned to a ‘central risk function’ (more below), not the regulators. Importantly, the AI WP indicates that such a central function ‘will be provided from within government’ (para 15 and below). This then raises two questions: (a) who will have the responsibility to proactively screen for such risks, if anyone, and (b) how has the Government not already taken action to close the gaps it recognises exist in the current legal landscape?

AI WP Figure 2: Central risks function activities.

This perpetuates the current regulatory gaps, in particular in sectors without a regulator or with regulators with very narrow mandates—such as the public sector and, to a large extent, public services. Importantly, this approach does not create any prohibition of impermissible AI uses, nor does it set any (workable) minimum requirements for the deployment of AI in high-risk uses, especially in the public sector. The contrast with the EU AI Act could not be starker and, in this aspect in particular, UK citizens should be very worried that the UK Government is not committing to any safeguards in the way technology can be used in eg determining access to public services, or by the law enforcement and judicial system. More generally, it is very worrying that the AI WP does not foresee any safeguards in relation to the quickly accelerating digitalisation of the public sector.

Loose central coordination leading to AI regulation privatisation

Remarkably, and in a functional disconnect similar to that of the GCSA Report (above), the decision not to create any new regulator/s (para 15) is taken in the same breath as the AI WP acknowledges that the small coordination layer within the regulatory architecture proposed in the 2022 AI regulation policy paper (ie, largely, the approach underpinning the DRCF) has been heavily criticised (para 13). The AI WP recognises that ‘the DRCF was not created to support the delivery of all the functions we have identified or the implementation of our proposed regulatory framework for AI’ (para 74).

The AI WP also stresses how ‘[w]hile some regulators already work together to ensure regulatory coherence for AI through formal networks like the AI and digital regulations service in the health sector and the Digital Regulation Cooperation Forum (DRCF), other regulators have limited capacity and access to AI expertise. This creates the risk of inconsistent enforcement across regulators. There is also a risk that some regulators could begin to dominate and interpret the scope of their remit or role more broadly than may have been intended in order to fill perceived gaps in a way that increases incoherence and uncertainty’ (para 29), which points at a strong functional need for a centralised approach to AI regulation.

To try and mitigate those regulatory risks and shortcomings, the AI WP proposes the creation of ‘a number of central support functions’, such as [i] a central function monitoring the overall regulatory framework’s effectiveness and the implementation of the principles; [ii] central risk monitoring and assessment; [iii] horizon scanning; [iv] supporting testbeds and sandboxes; [v] advocacy, education and awareness-raising initiatives; or [vi] promoting interoperability with international regulatory frameworks (para 14, see also para 73). Cryptically, the AI WP indicates that ‘central support functions will initially be provided from within government but will leverage existing activities and expertise from across the broader economy’ (para 15). Quite how this can be effectively done outwith a clearly defined, adequately resourced and durable institutional framework is anybody’s guess. In fact, the AI WP recognises that this approach ‘needs to evolve’ and that Government needs to understand how ‘existing regulatory forums could be expanded to include the full range of regulators’, what ‘additional expertise government may need’, and the ‘most effective way to convene input from across industry and consumers to ensure a broad range of opinions’ (para 77).

While the creation of a regulator seems a rather obvious answer to all these questions, the AI WP has rejected it in unequivocal terms. Is the AI WP a U-turn waiting to happen? Is the mention that ‘[a]s we enter a new phase we will review the role of the AI Council and consider how best to engage expertise to support the implementation of the regulatory framework’ (para 78) a placeholder for an imminent project to rejig the AI Council and turn it into an AI regulator? What is the place and role of the Office for AI and the Centre for Data Ethics and Innovation in all this?

Moreover, the AI WP indicates that the ‘proposed framework is aligned with, and supplemented by, a variety of tools for trustworthy AI, such as assurance techniques, voluntary guidance and technical standards. Government will promote the use of such tools’ (para 16). Relatedly, the AI WP relies on those mechanisms to avoid addressing issues of accountability across the AI life cycle, indicating that ‘[t]ools for trustworthy AI like assurance techniques and technical standards can support supply chain risk management. These tools can also drive the uptake and adoption of AI by building justified trust in these systems, giving users confidence that key AI-related risks have been identified, addressed and mitigated across the supply chain’ (para 84). Those tools are discussed in much more detail in part 4 of the AI WP (paras 106 ff). Annex A also creates a backdoor for technical standards to directly become the operationalisation of the general principles on which the regulatory framework is based, by explicitly identifying standards regulators may want to consider ‘to clarify regulatory guidance and support the implementation of risk treatment measures’.

This offloading of tricky regulatory issues to the emergence of private sector-led standards is simply an exercise in the transfer of regulatory power to those setting such standards, guidance and assurance techniques and, ultimately, a privatisation of AI regulation.

A different approach to sandboxes and testbeds?

The Government will take forward the GCSA recommendation to establish a regulatory sandbox for AI, which ‘will bring together regulators to support innovators directly and help them get their products to market. The sandbox will also enable us to understand how regulation interacts with new technologies and refine this interaction where necessary’ (p2). This is thus bound to hardwire some of the issues mentioned above in relation to the GCSA proposal, as well as being reflective of the general pro-industry approach of the AI WP, which is obvious in the framing that the regulators are expected to ‘support innovators directly and help them get their products to market’. Industrial policy seems to be shoehorned and mainstreamed across all areas of regulatory activity, at least in relation to AI (but it can then easily bleed into non-AI-related regulatory activities).

While the AI WP indicates the commitment to implement the AI sandbox recommended in the GCSA Report, it is by no means clear that the implementation will follow the report’s proposal (ie a multi-regulator sandbox nested under the DRCF, with an expectation that it would develop a crucial coordination and regulatory centralisation effect). The AI WP indicates that the Government still has to explore ‘what service focus would be most useful to industry’ in relation to AI sandboxes (para 96), but it sets out the intention to ‘focus an initial pilot on a single sector, multiple regulator sandbox’ (para 97), which diverges from the GCSA Report’s preference for a ‘multiple sectors, multiple regulators’ sandbox. While the public consultation intends to gather feedback on which industry sector is the most appropriate, I would bet that the financial services sector will be chosen and that the ‘regulatory innovation’ will simply result in some closer cooperation between the ICO and the FCA.

Regulator capabilities — AI regulation on a shoestring?

The AI WP turns to the issue of regulator capabilities and stresses that ‘While our approach does not currently involve or anticipate extending any regulator’s remit, regulating AI uses effectively will require many of our regulators to acquire new skills and expertise’ (para 102), and that the Government has ‘identified potential capability gaps among many, but not all, regulators’ (para 103).

To try to (start to) address this fundamental issue in the context of a devolved and decentralised regulatory framework, the AI WP indicates that the Government will explore, for example, whether it is ‘appropriate to establish a common pool of expertise that could establish best practice for supporting innovation through regulatory approaches and make it easier for regulators to work with each other on common issues. An alternative approach would be to explore and facilitate collaborative initiatives between regulators – including, where appropriate, further supporting existing initiatives such as the DRCF – to share skills and expertise’ (para 105).

While the creation of ‘common regulatory capacity’ has been advocated by the Alan Turing Institute, and while this (or inter-regulator secondments, for example) could be a short-term fix, it seems to address the obvious challenge of adequately resourcing regulatory bodies without a medium- and long-term strategy to build up the digital capability of the public sector, and thus risks perpetuating the current approach of AI regulation on a shoestring. The governance and organisational implications arising from the creation of a common pool of expertise need careful consideration, in particular as some of the likely dysfunctionalities are only marginally smaller than those of the current over-reliance on external consultants, or of the ‘salami-slicing’ approach to regulatory and policy interventions that seems to bleed from the ‘agile’ management of technological projects into the realm of regulatory activity, which however requires institutional memory and the embedding of knowledge and expertise.

Some further thoughts on setting procurement up to fail in 'AI regulation by contract'

The next bit of my research project concerns the leveraging of procurement to achieve ‘AI regulation by contract’ (ie to ensure in the use of AI by the public sector: trustworthiness, safety, explainability, human rights compliance, legality especially in data protection terms, ethical use, etc), so I have been thinking about it for the last few weeks to build on my previous views (see here).

In this post, I summarise my further thoughts — which have been prompted by the rich submissions to the House of Commons Science and Technology Committee [ongoing] inquiry on the ‘Governance of Artificial Intelligence’.

Let’s do it via procurement

As a starting point, it is worth stressing that the (perhaps unsurprising) increasingly generalised position is that procurement has a key role to play in regulating the adoption of digital technologies (and AI in particular) by the public sector—which consolidates procurement’s gatekeeping role in this regulatory space (see here).

More precisely, the generalised view is not that procurement ought to play such a role, but that it can do so (effectively and meaningfully). ‘AI regulation by contract’ via procurement is seen as an (easily?) actionable policy and governance mechanism despite the more generalised reluctance and difficulties in regulating AI through general legislative and policy measures, and in creating adequate governance architectures (more below).

This is very clear in several submissions to the ongoing Parliamentary inquiry (above). Without seeking to be exhaustive (I have read most, but not all submissions yet), the following points have been made in written submissions (liberally grouped by topics):

Procurement as (soft) AI regulation by contract & ‘Market leadership’

  • ‘Procurement processes can act as a form of soft regulation. Government should use its purchasing power in the market to set procurement requirements that ensure private companies developing AI for the public sector address public standards’ (Committee on Standards in Public Life, at [25]-[26], emphasis added).

  • ‘For public sector AI projects, two specific strategies could be adopted [to regulate AI use]. The first … is the use of strategic procurement. This approach utilises government funding to drive change in how AI is built and implemented, which can lead to positive spill-over effects in the industry’ (Oxford Internet Institute, at 5, emphasis added).

  • ‘Responsible AI Licences (“RAILs”) utilise the well-established mechanisms of software and technology licensing to promote self-governance within the AI sector. RAILs allow developers, researchers, and companies to publish AI innovations while specifying restrictions on the use of source code, data, and models. These restrictions can refer to high-level restrictions (e.g., prohibiting uses that would discriminate against any individual) as well as application-specific restrictions (e.g., prohibiting the use of a facial recognition system without consent) … The adoption of such licenses for AI systems funded by public procurement and publicly-funded AI research will help support a pro-innovation culture that acknowledges the unique governance challenges posed by emerging AI technologies’ (Trustworthy Autonomous Systems Hub, at 4, emphasis added).

Procurement and AI explainability

  • ‘public bodies will need to consider explainability in the early stages of AI design and development, and during the procurement process, where requirements for transparency could be stipulated in tenders and contracts’ (Committee on Standards in Public Life, at [17], emphasis added).

  • ‘In the absence of strong regulations, the public sector may use strategic procurement to promote equitable and transparent AI … mandating various criteria in procurement announcements and specifying design criteria, including explainability and interpretability requirements. In addition, clear documentation on the function of a proposed AI system, the data used and an explanation of how it works can help. Beyond this, an approved vendor list for AI procurement in the public sector is useful, to which vendors that agree to meet the defined transparency and explainability requirements may be added’ (Oxford Internet Institute, at 2, referring to K McBride et al (2021) ‘Towards a Systematic Understanding on the Challenges of Procuring Artificial Intelligence in the Public Sector’, emphasis added).

Procurement and AI ethics

  • ‘For example, procurement processes should be designed so products and services that facilitate high standards are preferred and companies that prioritise ethical practices are rewarded. As part of the commissioning process, the government should set out the ethical principles expected of companies providing AI services to the public sector. Adherence to ethical standards should be given an appropriate weighting as part of the evaluation process, and companies that show a commitment to them should be scored more highly than those that do not’ (Committee on Standards in Public Life, at [26], emphasis added).

Procurement and algorithmic transparency

  • ‘… unlike public bodies, the private sector is not bound by the same safeguards – such as the Public Sector Equality Duty within the Equality Act 2010 (EA) – and is able to shield itself from criticisms regarding transparency behind the veil of ‘commercial sensitivity’. In addition to considering the private company’s purpose, AI governance itself must cover the private as well as public sphere, and be regulated to the same, if not a higher standard. This could include strict procurement rules – for example that private companies need to release certain information to the end user/public, and independent auditing of AI systems’ (Liberty, at [20]).

  • ‘… it is important that public sector agencies are duly empowered to inspect the technologies they’re procuring and are not prevented from doing so by the intellectual property rights. Public sector buyers should use their purchasing power to demand access to suppliers’ systems to test and prove their claims about, for example, accuracy and bias’ (BILETA, at 6).

Procurement and technical standards

  • ‘Standards hold an important role in any potential regulatory regime for AI. Standards have the potential to improve transparency and explainability of AI systems to detail data provenance and improve procurement requirements’ (Ada Lovelace Institute, at 10).

  • ‘The speed at which the technology can develop poses a challenge as it is often faster than the development of both regulation and standards. Few mature standards for autonomous systems exist and adoption of emerging standards need to be encouraged through mechanisms such as regulation and procurement, for example by including the requirement to meet certain standards in procurement specification’ (Royal Academy of Engineering, at 8).

Can procurement do it, though?

Implicit in most views about the possibility of using procurement to regulate public sector AI adoption (and to generate broader spillover effects through market-based propagation mechanisms) is an assumption that the public buyer does (or can get to) know and can (fully, or sufficiently) specify the required standards of explainability, transparency, ethical governance, and myriad other technical requirements (on auditability, documentation, etc) for the use of AI to be in the public interest and fully legally compliant. Or, relatedly, that such standards can (and will) be developed and be readily available for the public buyer to effectively refer to and incorporate into its public contracts.

This is a BIG implicit assumption, at least in relation to non-trivial/open-ended proceduralised requirements and in relation to most of the complex issues raised by (advanced) forms of AI deployment. A sobering and persuasive analysis has shown that, at least for some forms of AI (based on neural networks), ‘it appears unlikely that anyone will be able to develop standards to guide development and testing that give us sufficient confidence in the applications’ respect for health and fundamental rights. We can throw risk management systems, monitoring guidelines, and documentation requirements around all we like, but it will not change that simple fact. It may even risk giving us a false sense of confidence’ [H Pouget, ‘The EU’s AI Act Is Barreling Toward AI Standards That Do Not Exist’ (Lawfare.com, 12 Jan 2023)].

Even for less complex AI deployments, the development of standards will be contested and protracted. This not only creates a transient regulatory gap that forces public buyers to ‘figure it out’ by themselves in the meantime, but can well result in a permanent regulatory gap that leaves procurement as the only safeguard (on paper) in the process of AI adoption in the public sector. If more general and specialised processes of standard setting are unlikely to plug that gap quickly or ever, how can public buyers be expected to do otherwise?

Seriously, can procurement do it?

Further, as I wrote in my own submission to the Parliamentary inquiry, ‘to effectively regulate by contract, it is at least necessary to have (i) clarity on the content of the obligations to be imposed, (ii) effective enforcement mechanisms, and (iii) public sector capacity to establish, monitor, and enforce those obligations. Given that the aim of regulation by contract would be to ensure that the public sector only adopts trustworthy AI solutions and deploys them in a way that promotes the public interest in compliance with existing standards of protection of fundamental and individual rights, exercising the expected gatekeeping role in this context requires a level of legal, ethical, and digital capability well beyond the requirements of earlier instances of regulation by contract to eg enforce labour standards’ (at [4]).

Even optimistically ignoring the issues above and presuming that standards will emerge or that the public buyer will (eventually) be able to figure it out (so we park requirement (i) for now), and also assuming that the public sector will be able to develop the required level of eg digital capability (so we also park (iii), but see here), this does not overcome other obstacles to leveraging procurement for ‘AI regulation by contract’. In particular, it does not address the issue of whether there can be effective enforcement mechanisms within the contractual relationship resulting from a procurement process to impose compliance with the required standards (of explainability, transparency, ethical use, non-discrimination, etc).

I approach this issue as the challenge of enforcing not entirely measurable contractual obligations (ie obligations to comply with a contractual standard rather than a contractual rule), and the closest parallel that comes to my mind is the issue of enforcing quality requirements in public contracts, especially in the provision of outsourced or contracted-out public services. This is an issue on which there is a rich literature (on ‘regulation by contract’ or ‘government by contract’).

Quality-related enforcement problems stem from the difficulty of using contract law remedies to address quality shortcomings: price reductions or contractual penalties (where permissible) can do little to address the quality issues in themselves. Major quality shortcomings could lead to eg contractual termination, but replacing contractors can be costly and difficult (especially in a technological setting affected by several sources of potential vendor and technology lock-in). Other mechanisms, such as leveraging past performance evaluations to eg bar access to future procurements, can also do too little too late to control quality within a specific contract.

An illuminating analysis of the ‘problem of quality’ concluded that the ‘structural problem here is that reliable assurance of quality in performance depends ultimately not on contract terms but on trust and non-legal relations. Relations of trust and powerful non-legal sanctions depend upon the establishment of long-term … relations … The need for a governance structure and detailed monitoring in order to achieve co-operation and quality seems to lead towards the creation of conflictual relations between government and external contractors’ [see H Collins, Regulating Contracts (OUP 1999) 314-15].

To me, this raises important questions about the extent to which procurement and public contracts more generally can effectively deliver the expected safeguards and operate as an adequate system of ‘AI regulation by contract’. It seems to me that price clawbacks or financial penalties, even debarment decisions, are unlikely to provide an acceptable safety net in some (or most) cases — eg high-risk uses of complex AI. Not least because procurement disputes can take a long time to settle and because the incentives will not always be there to ensure strict enforcement anyway.

More thoughts to come

It seems increasingly clear to me that the expectations around leveraging procurement to ‘regulate AI by contract’ need reassessing in view of its likely effectiveness. Such effectiveness is constrained by the rules on the design of tenders for the award of public contracts, by the rules applicable to those public contracts, and by the mechanisms to resolve disputes emerging from either tenders or contracts. The effectiveness of this approach is, of course, also constrained by public sector (digital) capability and by the broader difficulties in ascertaining the appropriate approach to (standards-based) AI regulation, which cannot so easily be set aside. I will keep thinking about all this in the process of writing my monograph. If this is of interest, keep an eye on this blog for further thoughts and analysis.

AI regulation by contract: submission to UK Parliament

In October 2022, the Science and Technology Committee of the House of Commons of the UK Parliament (STC Committee) launched an inquiry on the ‘Governance of Artificial Intelligence’. This inquiry follows the publication in July 2022 of the policy paper ‘Establishing a pro-innovation approach to regulating AI’, which outlined the UK Government’s plans for light-touch AI regulation. The inquiry seeks to examine the effectiveness of current AI governance in the UK, and the Government’s proposals that are expected to follow the policy paper and provide more detail. The STC Committee has published 98 pieces of written evidence, including submissions from UK regulators and academics that will make for interesting reading. Below is my submission, focusing on the UK’s approach to ‘AI regulation by contract’.

A. Introduction

01. This submission addresses two of the questions formulated by the House of Commons Science and Technology Committee in its inquiry on the ‘Governance of artificial intelligence (AI)’. In particular:

  • How should the use of AI be regulated, and which body or bodies should provide regulatory oversight?

  • To what extent is the legal framework for the use of AI, especially in making decisions, fit for purpose?

    • Is more legislation or better guidance required?

02. This submission focuses on the process of AI adoption in the public sector and, particularly, on the acquisition of AI solutions. It evidences how the UK is consolidating an inadequate approach to ‘AI regulation by contract’ through public procurement. Given the level of abstraction and generality of the current guidelines for AI procurement, major gaps in public sector digital capabilities, and potential structural conflicts of interest, procurement is currently an inadequate tool to govern the process of AI adoption in the public sector. Flanking initiatives, such as the pilot algorithmic transparency standard, are unable to address and mitigate governance risks. Contrary to the approach in the AI Regulation Policy Paper,[1] plugging the regulatory gap will require (i) new legislation supported by a new mechanism of external oversight and enforcement (an ‘AI in the Public Sector Authority’ (AIPSA)); (ii) a well-funded strategy to boost in-house public sector digital capabilities; and (iii) the introduction of a (temporary) mechanism of authorisation of AI deployment in the public sector. The Procurement Bill would not suffice to address the governance shortcomings identified in this submission.

B. ‘AI Regulation by Contract’ through Procurement

03. Unless the public sector develops AI solutions in-house, which is extremely rare, the adoption of AI technologies in the public sector requires a procurement procedure leading to their acquisition. This places procurement at the frontline of AI governance because the ‘rules governing the acquisition of algorithmic systems by governments and public agencies are an important point of intervention in ensuring their accountable use’.[2] In that vein, the Committee on Standards in Public Life stressed that the ‘Government should use its purchasing power in the market to set procurement requirements that ensure that private companies developing AI solutions for the public sector appropriately address public standards. This should be achieved by ensuring provisions for ethical standards are considered early in the procurement process and explicitly written into tenders and contractual arrangements’.[3] Procurement is thus erected as a public interest gatekeeper in the process of adoption of AI by the public sector.

04. However, to effectively regulate by contract, it is at least necessary to have (i) clarity on the content of the obligations to be imposed, (ii) effective enforcement mechanisms, and (iii) public sector capacity to establish, monitor, and enforce those obligations. Given that the aim of regulation by contract would be to ensure that the public sector only adopts trustworthy AI solutions and deploys them in a way that promotes the public interest in compliance with existing standards of protection of fundamental and individual rights, exercising the expected gatekeeping role in this context requires a level of legal, ethical, and digital capability well beyond the requirements of earlier instances of regulation by contract to eg enforce labour standards.

05. On a superficial reading, it could seem that the National AI Strategy tackled this by highlighting the importance of the public sector’s role as a buyer and stressing that the Government had already taken steps ‘to inform and empower buyers in the public sector, helping them to evaluate suppliers, then confidently and responsibly procure AI technologies for the benefit of citizens’.[4] The National AI Strategy referred, in particular, to the setting up of the Crown Commercial Service’s AI procurement framework (the ‘CCS AI Framework’),[5] and the adoption of the Guidelines for AI procurement (the ‘Guidelines’)[6] as enabling tools. However, a close look at these instruments will show their inadequacy to provide clarity on the content of procedural and contractual obligations aimed at ensuring the goals stated above (para 03), as well as their potential to widen the existing public sector digital capability gap. Ultimately, they do not enable procurement to carry out the expected gatekeeping role.

C. Guidelines and Framework for AI procurement

06. Despite setting out to ‘provide a set of guiding principles on how to buy AI technology, as well as insights on tackling challenges that may arise during procurement’, the Guidelines provide high-level recommendations that cannot be directly operationalised by inexperienced public buyers and/or those with limited digital capabilities. For example, the recommendation to ‘Try to address flaws and potential bias within your data before you go to market and/or have a plan for dealing with data issues if you cannot rectify them yourself’ (guideline 3) not only requires a thorough understanding of eg the Data Ethics Framework[7] and the Guide to using Artificial Intelligence in the public sector,[8] but also detailed insights on data hazards.[9] This leads the Guidelines to stress that it may be necessary ‘to seek out specific expertise to support this; data architects and data scientists should lead this process … to understand the complexities, completeness and limitations of the data … available’.

07. Relatedly, some of the recommendations are very open ended in areas without clear standards. For example, the effectiveness of the recommendation to ‘Conduct initial AI impact assessments at the start of the procurement process, and ensure that your interim findings inform the procurement. Be sure to revisit the assessments at key decision points’ (guideline 4) is dependent on the robustness of such impact assessments. However, the Guidelines provide no further detail on how to carry out such assessments, other than a list of some generic areas for consideration (eg ‘potential unintended consequences’) and a passing reference to emerging guidelines in other jurisdictions. This is problematic, as the development of algorithmic impact assessments is still at an experimental stage,[10] and emerging evidence shows vastly diverging approaches, eg to risk identification.[11] In the absence of clear standards, algorithmic impact assessments will lead to inconsistent approaches and varying levels of robustness. The absence of standards will also require access to specialist expertise to design and carry out the assessments.

08. Ultimately, understanding and operationalising the Guidelines requires advanced digital competency, including in areas where best practices and industry standards are still developing.[12] However, most procurement organisations lack such expertise, as a reflection of broader digital skills shortages across the public sector,[13] with recent reports placing vacancies for data and tech roles across the civil service alone at close to 4,000.[14] This not only reduces the practical value of the Guidelines to facilitate responsible AI procurement by inexperienced buyers with limited capabilities, but also highlights the role of the CCS AI Framework for AI adoption in the public sector.

09. The CCS AI Framework creates a procurement vehicle[15] to facilitate public buyers’ access to digital capabilities. CCS’ description for public buyers stresses that ‘If you are new to AI you will be able to procure services through a discovery phase, to get an understanding of AI and how it can benefit your organisation.’[16] The Framework thus seeks to enable contracting authorities, especially those lacking in-house expertise, to carry out AI procurement with the support of external providers. While this can foster the uptake of AI in the public sector in the short term, it is highly unlikely to result in adequate governance of AI procurement, as this approach focuses at most on the initial stages of AI adoption but can hardly be sustainable throughout the lifecycle of AI use in the public sector—and, crucially, would leave the enforcement of contractualised AI governance obligations in a particularly weak position (thus failing to meet the enforcement requirement at para 04). Moreover, it would generate a series of governance shortcomings whose avoidance requires an alternative approach.

D. Governance Shortcomings

10. Despite claims to the contrary in the National AI Strategy (above para 05), the approach currently followed by the Government does not empower public buyers to responsibly procure AI. The Guidelines are not susceptible of operationalisation by inexperienced public buyers with limited digital capabilities (above paras 06-08). At the same time, the Guidelines are too generic to support sophisticated approaches by more advanced digital buyers. The Guidelines do not reduce the uncertainty and complexity of procuring AI and do not include any guidance on eg how to design public contracts to perform the regulatory functions expected under the ‘AI regulation by contract’ approach.[17] This is despite existing recommendations on eg the development of ‘model contracts and framework agreements for public sector procurement to incorporate a set of minimum standards around ethical use of AI, with particular focus on expected levels transparency and explainability, and ongoing testing for fairness’.[18] The guidelines thus fail to address the first requirement for effective regulation by contract in relation to clarifying the relevant obligations (para 04).

11. The CCS Framework would also fail to ensure the development of public sector capacity to establish, monitor, and enforce AI governance obligations (para 04). Perhaps counterintuitively, the CCS AI Framework can generate a further disempowerment of public buyers seeking to rely on external capabilities to support AI adoption. There is evidence that reliance on outside providers and consultants to cover immediate needs further erodes public sector capability in the long term,[19] as well as creating risks of technical and intellectual debt in the deployment of AI solutions as consultants come and go and there is no capture of institutional knowledge and memory.[20] This can also exacerbate current trends of pilot AI graveyard spirals, where most projects do not reach full deployment, at least in part due to insufficient digital capabilities beyond the (outsourced) pilot phase. This tends to result in self-reinforcing institutional weaknesses that can limit the public sector’s ability to drive digitalisation, not least because technical debt quickly becomes a significant barrier.[21] It also runs counter to best practices towards building public sector digital maturity,[22] and to the growing consensus that public sector digitalisation first and foremost requires a prioritised investment in building up in-house capabilities.[23] On this point, it is important to note the large size of the CCS AI Framework, which was initially pre-advertised with a £90 mn value,[24] but this was then revised to £200 mn over 42 months.[25] Procuring AI consultancy services under the Framework can thus facilitate the funnelling of significant amounts of public funds to the private sector, rather than using those funds to build in-house capabilities. It can result in multiple public buyers entering contracts for the same expertise, which thus duplicates costs, as well as in a cumulative lack of institutional learning by the public sector because of atomised and uncoordinated contractual relationships.

12. Beyond the issue of institutional dependency on external capabilities, the cumulative effect of the Guidelines and the Framework would be to outsource the role of ‘AI regulation by contract’ to unaccountable private providers that can then introduce their own biases on the substantive and procedural obligations to be embedded in the relevant contracts—which would ultimately negate the effectiveness of the regulatory approach as a public interest safeguard. The lack of accountability of external providers would not only result from the weakness (or absolute inability) of the public buyer to control their activities and challenge important decisions—eg on data governance, or algorithmic impact assessments, as above (paras 06-07)—but also from the potential absence of effective and timely external checks. Market mechanisms are unlikely to deliver adequate checks due to market concentration and structural conflicts of interest affecting providers that sometimes provide consultancy services and at other times are involved in the development and deployment of AI solutions,[26] as well as a result of insufficiently effective safeguards on conflicts of interest resulting from quickly revolving doors. Equally, broader governance controls are unlikely to be facilitated by flanking initiatives, such as the pilot algorithmic transparency standard.

13. To try to foster accountability in the adoption of AI by the public sector, the UK is currently piloting an algorithmic transparency standard.[27] While the initial six examples of algorithmic disclosures published by the Government provide some details on emerging AI use cases and the data and types of algorithms used by publishing organisations, and while this information could in principle foster accountability, there are two primary shortcomings. First, completing the documentation requires resources and, in some respects, advanced digital capabilities. Organisations participating in the pilot are being supported by the Government, which makes it difficult to assess to what extent public buyers would generally be able to adequately prepare the documentation on their own. Moreover, the documentation also refers to some underlying requirements, such as algorithmic impact assessments, that are not yet standardised (para 07). In that, the pilot standard replicates the same shortcomings discussed above in relation to the Guidelines. Algorithmic disclosure will thus only be done by entities with high capabilities, or it will be outsourced to consultants (thus reducing the scope for the revelation of governance-relevant information).

14. Second, compliance with the standard is not mandatory—at least while the pilot is developed. If compliance with the algorithmic transparency standard remains voluntary, there are clear governance risks. It is easy to see how precisely the most problematic uses may not be the object of adequate disclosures under a voluntary self-reporting mechanism. More generally, even if the standard was made mandatory, it would be necessary to implement an external quality control mechanism to mitigate problems with the quality of self-reported disclosures that are pervasive in other areas of information-based governance.[28] Whether the Central Digital and Data Office (currently in charge of the pilot) would have capacity (and powers) to do so remains unclear, and it would in any case lack independence.

15. Finally, it should be stressed that the current approach to transparency disclosure following the adoption of AI (ex post) can be problematic where the implementation of the AI is difficult to undo and/or the effects of malicious or risky AI are high stakes or impossible to revert. It is also problematic in that the current approach places the burden of scrutiny and accountability outside the public sector, rather than establishing internal, preventative (ex ante) controls on the deployment of AI technologies that could potentially be very harmful for fundamental and individual socio-economic rights—as evidenced by the inclusion of some fields of application of AI in the public sector as ‘high risk’ in the EU’s proposed EU AI Act.[29] Given the particular risks that AI deployment in the public sector poses to fundamental and individual rights, the minimalistic and reactive approach outlined in the AI Regulation Policy Paper is inadequate.

E. Conclusion: An Alternative Approach

16. Ensuring that the adoption of AI in the public sector operates in the public interest and for the benefit of all citizens will require new legislation supported by a new mechanism of external oversight and enforcement. New legislation is required to impose specific minimum requirements of eg data governance and algorithmic impact assessment and related transparency across the public sector. Such legislation would then need to be developed in statutory guidance of a much more detailed and actionable nature than the current Guidelines. These developed requirements can then be embedded into public contracts by reference. Without such clarification of the relevant substantive obligations, the approach to ‘AI regulation by contract’ can hardly be effective other than in exceptional cases.

17. Legislation would also be necessary to create an independent authority—eg an ‘AI in the Public Sector Authority’ (AIPSA)—with powers to enforce those minimum requirements across the public sector. AIPSA is necessary, as oversight of the use of AI in the public sector does not currently fall within the scope of any specific sectoral regulator and the general regulators (such as the Information Commissioner’s Office) lack procurement-specific knowledge. Moreover, units within Cabinet Office (such as the Office for AI or the Central Digital and Data Office) lack the required independence.

18. It would also be necessary to develop a clear and sustainably funded strategy to build in-house capability in the public sector, including clear policies on the minimisation of expenditure directed at the engagement of external consultants and the development of guidance on how to ensure the capture and retention of the knowledge developed within outsourced projects (including, but not only, through detailed technical documentation).

19. Until sufficient in-house capability is built to ensure adequate understanding and ability to manage digital procurement governance requirements independently, the current reactive approach should be abandoned, and AIPSA should have to approve all projects to develop, procure and deploy AI in the public sector to ensure that they meet the required legislative safeguards in terms of data governance, impact assessment, etc. This approach could progressively be relaxed through eg block exemption mechanisms, once there is sufficiently detailed understanding and guidance on specific AI use cases and/or in relation to public sector entities that could demonstrate sufficient in-house capability, eg through a mechanism of independent certification.

20. The new legislation and statutory guidance would need to be self-standing, as the Procurement Bill would not provide the required governance improvements. First, the Procurement Bill pays limited to no attention to artificial intelligence and the digitalisation of procurement.[30] An amendment (46) that would have created minimum requirements on automated decision-making and data ethics was not moved at the Lords Committee stage, and it seems unlikely to be taken up again at later stages of the legislative process. Second, even if the Procurement Bill created minimum substantive requirements, it would lack adequate enforcement mechanisms, not least due to the limited powers and lack of independence of the foreseen Procurement Review Unit (to also sit within Cabinet Office).

_______________________________________
Note: all websites last accessed on 25 October 2022.

[1] Department for Digital, Culture, Media and Sport, Establishing a pro-innovation approach to regulating AI. An overview of the UK’s emerging approach (CP 728, 2022).

[2] Ada Lovelace Institute, AI Now Institute and Open Government Partnership, Algorithmic Accountability for the Public Sector (August 2021) 33.

[3] Committee on Standards in Public Life, Artificial Intelligence and Public Standards (2020) 51.

[4] Department for Digital, Culture, Media and Sport, National AI Strategy (CP 525, 2021) 47.

[5] AI Dynamic Purchasing System < https://www.crowncommercial.gov.uk/agreements/RM6200 >.

[6] Office for Artificial Intelligence, Guidelines for AI Procurement (2020) < https://www.gov.uk/government/publications/guidelines-for-ai-procurement/guidelines-for-ai-procurement >.

[7] Central Digital and Data Office, Data Ethics Framework (Guidance) (2020) < https://www.gov.uk/government/publications/data-ethics-framework >.

[8] Central Digital and Data Office, A guide to using artificial intelligence in the public sector (2019) < https://www.gov.uk/government/collections/a-guide-to-using-artificial-intelligence-in-the-public-sector >.

[9] See eg < https://datahazards.com/index.html >.

[10] Ada Lovelace Institute, Algorithmic impact assessment: a case study in healthcare (2022) < https://www.adalovelaceinstitute.org/report/algorithmic-impact-assessment-case-study-healthcare/ >.

[11] A Sanchez-Graells, ‘Algorithmic Transparency: Some Thoughts On UK's First Four Published Disclosures and the Standards’ Usability’ (2022) < https://www.howtocrackanut.com/blog/2022/7/11/algorithmic-transparency-some-thoughts-on-uk-first-disclosures-and-usability >.

[12] A Sanchez-Graells, ‘“Experimental” WEF/UK Guidelines for AI Procurement: Some Comments’ (2019) < https://www.howtocrackanut.com/blog/2019/9/25/wef-guidelines-for-ai-procurement-and-uk-pilot-some-comments >.

[13] See eg Public Accounts Committee, Challenges in implementing digital change (HC 2021-22, 637).

[14] S Klovig Skelton, ‘Public sector aims to close digital skills gap with private sector’ (Computer Weekly, 4 Oct 2022) < https://www.computerweekly.com/news/252525692/Public-sector-aims-to-close-digital-skills-gap-with-private-sector >.

[15] It is a dynamic purchasing system, or a list of pre-screened potential vendors public buyers can use to carry out their own simplified mini-competitions for the award of AI-related contracts.

[16] Above (n 5).

[17] This contrasts with eg the EU project to develop standard contractual clauses for the procurement of AI by public organisations. See < https://living-in.eu/groups/solutions/ai-procurement >.

[18] Centre for Data Ethics and Innovation, Review into bias in algorithmic decision-making (2020) < https://www.gov.uk/government/publications/cdei-publishes-review-into-bias-in-algorithmic-decision-making/main-report-cdei-review-into-bias-in-algorithmic-decision-making >.

[19] V Weghmann and K Sankey, Hollowed out: The growing impact of consultancies in public administrations (2022) < https://www.epsu.org/sites/default/files/article/files/EPSU%20Report%20Outsourcing%20state_EN.pdf >.

[20] A Sanchez-Graells, ‘Identifying Emerging Risks in Digital Procurement Governance’ in idem, Digital Technologies and Public Procurement. Gatekeeping and experimentation in digital public governance (OUP, forthcoming) < https://ssrn.com/abstract=4254931 >.

[21] M E Nielsen and C Østergaard Madsen, ‘Stakeholder influence on technical debt management in the public sector: An embedded case study’ (2022) 39 Government Information Quarterly 101706.

[22] See eg Kevin C Desouza, ‘Artificial Intelligence in the Public Sector: A Maturity Model’ (2021) IBM Centre for the Business of Government < https://www.businessofgovernment.org/report/artificial-intelligence-public-sector-maturity-model >.

[23] A Clarke and S Boots, A Guide to Reforming Information Technology Procurement in the Government of Canada (2022) < https://govcanadacontracts.ca/it-procurement-guide/ >.

[24] < https://ted.europa.eu/udl?uri=TED:NOTICE:600328-2019:HTML:EN:HTML&tabId=1&tabLang=en >.

[25] < https://ted.europa.eu/udl?uri=TED:NOTICE:373610-2020:HTML:EN:HTML&tabId=1&tabLang=en >.

[26] See S Boots, ‘“Charbonneau Loops” and government IT contracting’ (2022) < https://sboots.ca/2022/10/12/charbonneau-loops-and-government-it-contracting/ >.

[27] Central Digital and Data Office, Algorithmic Transparency Standard (2022) < https://www.gov.uk/government/collections/algorithmic-transparency-standard >.

[28] Eg in the context of financial markets, there have been notorious ongoing problems with ensuring adequate quality in corporate and investor disclosures.

[29] < https://artificialintelligenceact.eu/ >.

[30] P Telles, ‘The lack of automation ideas in the UK Gov Green Paper on procurement reform’ (2021) < http://www.telles.eu/blog/2021/1/13/the-lack-of-automation-ideas-in-the-uk-gov-green-paper-on-procurement-reform >.

Interesting legislative proposal to make procurement of AI conditional on external checks

Procurement is progressively put in the position of regulating what types of artificial intelligence (AI) are deployed by the public sector (ie taking a gatekeeping function; see here and here). This implies that the procurement function should be able to verify that the intended AI (and its use/foreseeable misuse) will not cause harms—or, where harms are unavoidable, come up with a system to weigh, and if appropriate/possible manage, that risk. I am currently trying to understand the governance implications of this emerging gatekeeping role to assess whether procurement is best placed to carry it out.

In the context of this reflection, I found a very useful recent paper: M E Kaminski, ‘Regulating the Risks of AI’ (2023) 103 Boston University Law Review forthcoming. In addition to providing a useful critique of the treatment of AI harms as risk and of the implications in terms of the regulatory baggage that (different types of) risk regulation implies, Kaminski provides an overview of a very interesting legislative proposal: Washington State’s Bill SB 5116.

Bill SB 5116 is a proposal for new legislation ‘establishing guidelines for government procurement and use of automated decision systems in order to protect consumers, improve transparency, and create more market predictability'. The governance approach underpinning the Bill is interesting in two respects.

First, the Bill includes a ban on certain uses of AI in the public sector. As Kaminski summarises: ‘Sec. 4 of SB 5116 bans public agencies from engaging in (1) the use of an automated decision system that discriminates, (2) the use of an “automated final decision system” to “make a decision impacting the constitutional or legal rights… of any Washington resident” (3) the use of an “automated final decision system…to deploy or trigger any weapon;” (4) the installation in certain public places of equipment that enables AI-enabled profiling, (5) the use of AI-enabled profiling “to make decisions that produce legal effects or similarly significant effects concerning individuals’ (at 66, fn 398).

Second, the Bill subjects the procurement of the AI to approval by the director of the office of the chief information officer. As Kaminski clarifies: ‘The bill’s assessment process is thus more like a licensing scheme than many proposed impact assessments in that it envisions a central regulator serving a gatekeeping function (albeit probably not an intensive one, and not over private companies, which aren’t covered by the bill at all). In fact, the bill is more protective than the GDPR in that the state CIO must make the algorithmic accountability report public and invite public comment before approving it’ (at 66, references omitted).

What the Bill does, then, is to displace the gatekeeping role from the procurement function itself to an external reviewer (the office of the chief information officer). It also sets the specific substantive criteria that reviewer has to apply in deciding whether to authorise the procurement of the AI.
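
Purely as an illustration of the decision flow this implies (and emphatically not a rendering of the Bill's actual text, categories or thresholds), a minimal sketch of such an ex ante authorisation gate could look as follows; every name, label and step in it is a hypothetical simplification.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class Decision(Enum):
    PROHIBITED = auto()  # intended use falls within a banned category (cf. the Sec. 4-type bans)
    REJECTED = auto()    # external reviewer withholds approval
    APPROVED = auto()    # procurement may proceed

# Hypothetical, simplified labels for banned public sector uses.
PROHIBITED_USES = {
    "automated final decision on legal or constitutional rights",
    "weapon deployment or triggering",
    "ai-enabled profiling in public places",
}

@dataclass
class AccountabilityReport:
    system_name: str
    intended_uses: set[str]
    published_for_comment: bool = False  # report must be public before approval
    public_comments: list[str] = field(default_factory=list)

def review_procurement(report: AccountabilityReport, reviewer_satisfied: bool) -> Decision:
    """Ex ante gate: screen against prohibited uses, require publication for
    public comment, and only then let the external reviewer approve or reject."""
    if report.intended_uses & PROHIBITED_USES:
        return Decision.PROHIBITED
    if not report.published_for_comment:
        return Decision.REJECTED
    return Decision.APPROVED if reviewer_satisfied else Decision.REJECTED
```

The design point the sketch tries to capture is simply that the substantive judgement (the `reviewer_satisfied` input) sits outside the procuring team, which is the displacement just described.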

Without getting into the detail of the Washington Bill, this governance approach seems to have two main strengths over the current emerging model of procurement self-regulation of the gatekeeping role (in the EU).

First, it facilitates a standardisation of the substantive criteria to be applied in assessing the potential harms resulting from AI adoption in the public sector, with a concentration on the specific characteristics of decision-making in this context. Importantly, it creates a clear area of illegality. Some of it is in line with eg the prohibition of certain AI uses in the Draft EU AI Act (profiling), or in the GDPR (prohibition of solely automated individual decision-making, including profiling — although it may go beyond it). Moreover, such an approach would allow for an expansion of prohibited uses in the specific context of the public sector, which the EU AI Act mostly fails to tackle (see here). It would also allow for the specification of constraints applicable to the use of AI by the public sector, such as a heightened obligation to provide reasons (see M Fink & M Finck, ‘Reasoned A(I)dministration: Explanation Requirements in EU Law and the Automation of Public Administration’ (2022) 47(3) European Law Review 376-392).

Second, it introduces an element of external (independent) verification of the assessment of potential AI harms. I think this is a crucial governance point because most proposals relying on the internal (self) assessment by the procurement team fail to consider the extent to which such an approach ensures (a) adequate resourcing (eg specialism and experience in the type of assessment) and (b) sufficient objectivity in the assessment. On the second point, with procurement teams often being told to ‘just go and procure what is needed’, moving to a position of gatekeeper or controller could be too big an ask (depending on institutional aspects that require closer consideration). Moreover, this would be different from other aspects of gatekeeping that procurement has progressively been asked to carry out (also excessively, in my view: see here).

When the procurement function is asked to screen for eg potential contractors’ social or environmental compliance track record, it is usually at arm’s length from those being reviewed (and the rules on conflict of interest are there to strengthen that position). Conversely, when the procurement function is asked to screen for the likely impact on citizens and/or users of public services of an initiative promoted by the operational part of the organisation to which it belongs, things are much more complicated.

That is why some systems (like the US FAR) create elements of separation between the procurement team and those in charge of reviewing eg competition issues (by means of the competition advocate). This is a model reflected in the Washington Bill’s approach to requiring external (even if within the public administration) verification and approval of the AI impact assessment. If procurement is to become a properly functioning gatekeeper of the adoption of AI by the public sector, this regulatory approach (ie having an ‘AI Harms Controller’) seems promising. Definitely a model worth thinking about for a little longer.

Algorithmic transparency: some thoughts on UK's first four published disclosures and the standards' usability


The Algorithmic Transparency Standard (ATS) is one of the UK’s flagship initiatives for the regulation of public sector use of artificial intelligence (AI). The ATS encourages (but does not mandate) public sector entities to fill in a template to provide information about the algorithmic tools they use, and why they use them [see e.g. Kingsman et al (2022) for an accessible overview].

The ATS is currently being piloted, and has so far resulted in the publication of four disclosures relating to the use of algorithms in different parts of the UK’s public sector. In this post, I offer some thoughts based on these initial four disclosures, in particular from the perspective of the usability of the ATS in facilitating an enhanced understanding of AI use cases, and accountability for those.

The first four disclosed AI use cases

The ATS pilot has so far published information in two batches (on 1 June and 6 July 2022), comprising the following four AI use cases:

  1. Within Cabinet Office, the GOV.UK Data Labs team piloted the ATS for their Related Links tool: a recommendation engine built to aid navigation of GOV.UK (the primary UK central government website) by providing relevant onward journeys from a content page, with the aim of helping users find useful information and content.

  2. In the Department for Health and Social Care and NHS Digital, the QCovid team piloted the ATS with a COVID-19 clinical tool used to predict how at risk individuals might be from COVID-19. The tool was developed for use by clinicians in support of conversations with patients about personal risk, and it uses algorithms to combine a number of factors such as age, sex, ethnicity, height and weight (to calculate BMI), and specific health conditions and treatments in order to estimate the combined risk of catching coronavirus and being hospitalised or catching coronavirus and dying. Importantly, “The original version of the QCovid algorithms were also used as part of the Population Risk Assessment to add patients to the Shielded Patient List in February 2021. These patients were advised to shield at that time, were provided support for doing so, and were prioritised for COVID-19 vaccination.”

  3. The Information Commissioner's Office has piloted the ATS with its Registration Inbox AI, which uses a machine learning algorithm to categorise emails sent to the Information Commissioner's Office’s registration inbox and to send out an auto-reply where the algorithm “detects … a request about changing a business address. In cases where it detects this kind of request, the algorithm sends out an autoreply that directs the customer to a new online service and points out further information required to process a change request. Only emails with an 80% certainty of a change of address request will be sent an email containing the link to the change of address form.”

  4. The Food Standards Agency piloted the ATS with its Food Hygiene Rating Scheme (FHRS) – AI, an algorithmic tool to help local authorities prioritise inspections of food businesses based on their predicted food hygiene rating, that is, by predicting which establishments might be at a higher risk of non-compliance with food hygiene regulations. Importantly, the tool is of voluntary use and “it is not intended to replace the current approach to generate a FHRS score. The final score will always be the result of an inspection undertaken by [a local authority] officer.”

Harmless (?) use cases

At first glance, and on the basis of the implications of the outcome of the algorithmic recommendation, it would seem that the four use cases are relatively harmless:

  1. If GOV.UK recommends links to content that is not relevant or helpful, the user may simply ignore them.

  2. The outcome of the QCovid tool simply informs the GPs’ (or other clinicians’) assessment of the risk of their patients, and the GPs’ expertise should mediate any incorrect (either over-inclusive, or under-inclusive) assessments by the AI.

  3. If the ICO sends an automatic email with information on how to change their business address to somebody who had submitted a different query, the recipient can simply ignore that email.

  4. Incorrect or imperfect prioritisation of food businesses for inspection could result in the early inspection of a low-risk restaurant, or the late(r) inspection of a higher-risk restaurant, but this is already a risk implicit in allowing restaurants to open pending inspection; the AI does not add to that pre-existing risk.

However, this approach could be too simplistic or optimistic. It can be helpful to think about what could really happen if the AI got it wrong ‘in a disaster scenario’, based on possible user reactions (a useful approach promoted by the Data Hazards project). It seems to me that, on ‘worst case scenario’ thinking (and without seeking to be exhaustive):

  1. If GOV.UK recommends content that is not helpful but confusing, the user can either engage in red tape they did not need to complete (wasting both their time and public resources) or, worse, feel overwhelmed, confused or misled and abandon the administrative interaction they were initially seeking to complete. This can lead to exclusion from public services, and be particularly problematic if such situations have a differential impact on different user groups.

  2. There could be over-reliance on the QCovid algorithm by (too busy) GPs. This could lead to advising, ‘as a matter of routine’, the taking of excessive precautions with significant potential impacts on the day-to-day lives of those affected—as was arguably the case for some of the citizens included in shielding categories under the earlier incarnation of the algorithm. Conversely, GPs that identified problems in the early use of the algorithm could simply ignore it, thus potentially losing the benefits of the algorithm in other cases where it could have been helpful—potentially leading to under-precaution by individuals who could otherwise have been better safeguarded.

  3. Similarly to 1, the provision of irrelevant and potentially confusing information can lead to a waste of resources (e.g. users seeking to change their business registration address because they wrongly think it is a requirement to process their query or, at a lower end of the scale, users having to read and consider information about an administrative process they have no interest in). Beyond that, the classification algorithm could lead to queries being lost if there was no human check to verify that the AI classification was correct. If this check takes place anyway, the advantages of automating the sending of the initial email seem rather marginal.

  4. Similarly to 2, the incorrect prediction of risk can lead to the misuse of resources in the carrying out of inspections by local authorities, potentially pushing down the list of restaurants pending inspection some that are high-risk and that could thus see their inspection repeatedly delayed. This could have important public health implications, at least for those citizens using the yet-to-be-inspected restaurants for longer than they otherwise would have. Conversely, inaccurate prioritisations that did not seem to catch the more ‘risky’ restaurants could also lead to local authorities abandoning the tool’s use. There is also a risk of profiling of certain types of businesses (and their owners), which could lead to victimisation if the tool was improperly used, or used in relation to restaurants that have been active for a longer period (eg to trigger fresh (re)inspections).

No AI application is thus entirely harmless. Of course, this is just a matter of theoretical speculation—just as it could be speculated whether reduced engagement with the AI would generate second-tier negative effects, eg if ‘learning’ algorithms could not be revised and improved on the basis of ‘real-life’ feedback on whether or not their predictions were accurate.

I think that this sort of speculation offers a useful yardstick to assess the extent to which the ATS can be helpful and usable. I would argue that the ATS will be helpful to the extent that (a) it provides information capable of clarifying whether the relevant risks have been taken into account and properly mitigated or, failing that, (b) it provides information that can be used to challenge the insufficiency of any underlying risk assessments or mitigation strategies. Ultimately, AI transparency is not an end in itself, but simply a means of increasing accountability—at least in the context of public sector AI adoption. Any degree of transparency generated by the ATS will be an improvement on the current situation, but is the ATS really usable?

Finding out more on the basis of the ATS disclosures

To try to answer that general question on whether the ATS is usable and serves to facilitate increased accountability, I have read the four disclosures in full. Here is my summary/extracts of the relevant bits for each of them.

GOV.UK Related Links

Since May 2019, the tool has been using an algorithm called node2vec (a machine learning algorithm that learns network node embeddings) to train a model on the last three weeks of user movement data (web analytics data). The benefits are described as follows: “the tool … predicts related links for a page. These related links are helpful to users. They help users find the content they are looking for. They also help a user find tangentially related content to the page they are on; it’s a bit like when you are looking for a book in the library, you might find books that are relevant to you on adjacent shelves.”

The way the tool works is described in some more detail: “The tool updates links every three weeks and thus tracks changes in user behaviour.” “Every three weeks, the machine learning algorithm is trained using the last three weeks of analytics data and trains a model that outputs related links that are published, overwriting the existing links with new ones.” “The average click through rate for related links is about 5% of visits to a content page. For context, GOV.UK supports an average of 6 million visits per day (Jan 2022). True volumes are likely higher owing to analytics consent tracking. We only track users who consent to analytics cookies …”.

The decision process is fully automated, but there is “a way for publishers to add/amend or remove a link from the component. On average this happens two or three times a month.” “Humans have the capability to recommend changes to related links on a page. There is a process for links to be amended manually and these changes can persist. These human expert generated links are preferred to those generated by the model and will persist.” Moreover, “GOV.UK has a feedback link, “report a problem with this page”, on every page which allows users to flag incorrect links or links they disagree with.” The tool was subjected to a Data Protection Impact Assessment (DPIA), but no other impact assessments (IAs) are listed.
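
Purely by way of illustration, a minimal sketch of the workflow the disclosure describes (node2vec embeddings retrained on roughly three weeks of navigation data, up to five suggested links per page, and human-curated links taking precedence and persisting) might look something like the following. The function and variable names are hypothetical, and this is an approximation of the disclosed approach rather than the actual GOV.UK code:

```python
# Illustrative sketch only -- an approximation of the disclosed workflow,
# not the actual GOV.UK implementation. Assumes page-to-page user journeys
# extracted from ~3 weeks of analytics data and the open-source `node2vec`
# package (pip install node2vec networkx gensim).
import networkx as nx
from node2vec import Node2Vec

def refresh_related_links(page_transitions, human_overrides, top_n=5):
    """page_transitions: iterable of (from_page, to_page) path strings;
    human_overrides: dict mapping a page to its manually curated links."""
    graph = nx.Graph()
    graph.add_edges_from(page_transitions)

    # Learn node embeddings from random walks over the navigation graph.
    embedder = Node2Vec(graph, dimensions=64, walk_length=20, num_walks=50, workers=1)
    model = embedder.fit(window=5, min_count=1)

    related = {}
    for page in graph.nodes:
        if page in human_overrides:
            # Human expert links are preferred and persist across retraining.
            related[page] = human_overrides[page]
        elif page in model.wv:
            # The top-N most similar pages in embedding space become related links.
            related[page] = [p for p, _ in model.wv.most_similar(page, topn=top_n)]
    return related  # previously published links are overwritten with this output
```

Retraining every three weeks and overwriting the previously published links, as described, leaves the deny list and the publisher override as the only standing quality controls. At the reported click-through rate of around 5% on roughly 6 million daily visits, related links would, very roughly, be driving onward journeys in the order of a few hundred thousand per day, which puts the two or three manual amendments per month into perspective.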

When it comes to risk identification and mitigation, the disclosure indicates: “A recommendation engine can produce links that could be deemed wrong, useless or insensitive by users (e.g. links that point users towards pages that discuss air accidents).” and that, as mitigation: “We added pages to a deny list that might not be useful for a user (such as the homepage) or might be deemed insensitive (e.g. air accident reports). We also enabled publishers or anyone with access to the tagging system to add/amend or remove links. GOV.UK users can also report problems through the feedback mechanisms on GOV.UK.”

Overall, then, the risk I had identified is only superficially acknowledged, in that the ATS disclosure does not show awareness of the potentially differing implications of incorrect or useless recommendations across the spectrum of users. The narrative equating the recommendations to browsing the shelves of a library is quite suggestive in that regard, as is the fact that the quality controls are rather limited.

Indeed, it seems that the quality control mechanisms require a high level of effort from every publisher, as they would need to check every three weeks whether the (new) related links appearing in each of the pages they publish are relevant and unproblematic. This seems to have reversed the functional balance of convenience. Before the implementation of the tool, only approximately 2,000 out of 600,000 pieces of content on GOV.UK had related links, as they had to be created manually (and thus, hopefully, were relevant, if not necessarily unproblematic). Now, almost all pages have up to five related content suggestions, but only two or three out of 600,000 pages see their links manually amended per month. A question arises whether this extremely low rate of manual intervention reflects the high quality of the system, or is rather evidence of the lack of resource for quality assurance that previously prevented over 99% of pages from having this type of related information.

However, despite these queries as to the desirability of the AI implementation as described, the ATS disclosure is in itself useful because it allows the type of analysis above and, in case someone considers the situation unsatisfactory or would like to probe it further, there is a clear gateway to (try to) engage the entity responsible for this AI deployment.

QCovid algorithm

The algorithm was developed at the onset of the Covid-19 pandemic to drive government decisions on which citizens to advise to shield, support during shielding, and prioritise for vaccination rollout. Since the end of the shielding period, the tool has been modified. “The clinical tool for clinicians is intended to support individual conversations with patients about risk. Originally, the goal was to help patients understand the reasons for being asked to shield and, where relevant, help them do so. Since the end of shielding requirements, it is hoped that better-informed conversations about risk will have supported patients to make appropriate decisions about personal risk, either protecting them from adverse health outcomes or to some extent alleviating concerns about re-engaging with society.”

“In essence, the tool creates a risk calculation based on scoring risk factors across a number of data fields pertaining to demographic, clinical and social patient information.” “The factors incorporated in the model include age, ethnicity, level of deprivation, obesity, whether someone lived in residential care or was homeless, and a range of existing medical conditions, such as cardiovascular disease, diabetes, respiratory disease and cancer. For the latest clinical tool, separate versions of the QCOVID models were estimated for vaccinated and unvaccinated patients.”
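
Without any pretence of reflecting the actual QCovid models (which are clinically developed, validated and far more sophisticated), the general pattern the disclosure describes (weighted risk factors combined into a single risk estimate) can be illustrated with a toy example. Every factor, weight and intercept below is invented for illustration only:

```python
# Toy illustration of 'risk factors combined into a risk estimate'.
# The factors, weights and intercept are invented and bear no relation
# to the actual, clinically validated QCovid models.
from dataclasses import dataclass
import math

@dataclass
class Patient:
    age: int
    bmi: float
    has_diabetes: bool
    has_respiratory_disease: bool

BASELINE = -6.0   # hypothetical intercept (log-odds)
WEIGHTS = {       # hypothetical log-odds contributions per factor
    "age_per_decade": 0.6,
    "bmi_over_30": 0.4,
    "diabetes": 0.5,
    "respiratory_disease": 0.7,
}

def combined_risk(p: Patient) -> float:
    """Return an illustrative probability of the adverse outcome."""
    score = BASELINE
    score += WEIGHTS["age_per_decade"] * (p.age / 10)
    score += WEIGHTS["bmi_over_30"] * (p.bmi > 30)
    score += WEIGHTS["diabetes"] * p.has_diabetes
    score += WEIGHTS["respiratory_disease"] * p.has_respiratory_disease
    return 1 / (1 + math.exp(-score))  # logistic link
```

The governance point is that the clinician sees only the combined output, so the reasonableness of the weighting and of the underlying data is precisely what an impact assessment (and the ATS disclosure) would need to make scrutable.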

It is difficult to assess how intensely the tool is (currently) used, although the ATS indicates that “In the period between 1st January 2022 and 31st March 2022, there were 2,180 completed assessments” and that “Assessment numbers often move with relative infection rate (e.g. higher infection rate leads to more usage of the tool).” The ATS also stresses that “The use of the tool does not override any clinical decision making but is a supporting device in the decision making process.” “The tool promotes shared decision making with the patient and is an extra point of information to consider in the decision making process. The tool helps with risk/benefit analysis around decisions (e.g. recommendation to shield or take other precautionary measures).”

The impact assessment of this tool is driven by those mandated for medical devices. The description is thus rather technical and not very detailed, although the selected examples it includes do capture the possibility of somebody being misidentified “as meeting the threshold for higher risk”, as well as someone not having “an output generated from the COVID-19 Predictive Risk Model”. The ATS does stress that “As part of patient safety risk assessment, Hazardous scenarios are documented, yet haven’t occurred as suitable mitigation is introduced and implemented to alleviate the risk.” That mitigation largely seems to be that “The tool is designed for use by clinicians who are reminded to look through clinical guidance before using the tool.”

I think this case shows two things. First, that it is difficult to understand how different parts of the analysis fit together when a tool that has had two very different uses is the object of a single ATS disclosure. There seems to be a good argument for use case specific ATS disclosures, even if the underlying AI deployment is the same (or a closely related one), as the implications of different uses from a governance perspective also differ.

Second, that in the context of AI adoption for healthcare purposes, there is a dual barrier to accessing relevant (and understandable) information: the tech barrier and the medical barrier. While the ATS does something to reduce the former, the latter very much remains in place, and perhaps turns the issue of trustworthiness of the AI into one of trustworthiness of the clinician, which is not necessarily entirely helpful (not only in this specific use case, but in many others one can imagine). In that regard, it seems that the usability of the ATS is partially limited, and more could be done to increase meaningful transparency through AI-specific IAs, perhaps as proposed by the Ada Lovelace Institute.

In this case, the ATS disclosure has also provided some valuable information, but arguably to a lesser extent than the previous case study.

ICO’s Registration Inbox AI

This is a tool that very much resembles other forms of email classification (e.g. spam filters), as “This algorithmic tool has been designed to inspect emails sent to the ICO’s registration inbox and send out autoreplies to requests made about changing addresses. The tool has not been designed to automatically change addresses on the requester’s behalf. The tool has not been designed to categorise other types of requests sent to the inbox.”

The disclosure indicates that “In a significant proportion of emails received, a simple redirection to an online service is all that is required. However, sifting these types of emails out would also require time if done by a human. The algorithm helps to sift out some of these types of emails that it can then automatically respond to. This enables greater capacity for [Data Protection] Fees Officers in the registration team, who can, consequently, spend more time on more complex requests.” “There is no manual intervention in the process - the links are provided to the customer in a fully automated manner.”

The tool has been in use since May 2021 and classifies approximately 23,000 emails a month.

When it comes to risk identification and mitigation, the ATS disclosure stresses that “The algorithmic tool does not make any decisions, but instead provides links in instances where it has calculated the customer has contacted the ICO about an address change, giving the customer the opportunity to self-serve.” Moreover, it indicates that there is “No need for review or appeal as no decision is being made. Incorrectly classified emails would receive the default response which is an acknowledgement.” It further stresses that “The classification scope is limited to a change of address and a generic response stating that we have received the customer’s request and that it will be processed within an estimated timeframe. Incorrectly classified emails would receive the default response which is an acknowledgement. This will not have an impact on personal data. Only emails with an 80% certainty of a change of address request will be sent an email containing the link to the change of address form.”
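
The disclosed behaviour (an autoreply with the link only where the tool is at least 80% certain the email is a change-of-address request, and the default acknowledgement otherwise) can be sketched as follows. The classifier and the send function are hypothetical stand-ins, not the ICO’s actual implementation:

```python
# Minimal sketch of the threshold-based triage described in the disclosure.
# `classify` and `send_reply` are hypothetical stand-ins.
CONFIDENCE_THRESHOLD = 0.80  # only >=80% certainty triggers the link autoreply

def triage_email(email_text: str, classify, send_reply) -> str:
    """classify returns P(change-of-address request) for the email text."""
    p_change_of_address = classify(email_text)
    if p_change_of_address >= CONFIDENCE_THRESHOLD:
        send_reply("You can change your registered address using our online "
                   "service: <link>. Please have the required details ready.")
        return "autoreplied"
    # Default acknowledgement (whether the email then stays in the human queue
    # is not clear from the disclosure).
    send_reply("We have received your request and will process it within the "
               "estimated timeframe.")
    return "acknowledged"
```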

In my view, this disclosure does not entirely clarify the way the algorithm works (e.g. what happens to emails classified as having requested information on change of address? Are they ‘deleted’ from the backlog of emails requiring a (human) non-automated response?). However, it does provide sufficient information to further consolidate the questions arising from the general description. For example, it seems that the identification of risks is clearly partial in that there is not only a risk of someone asking for change of address information not automatically receiving it, but also a risk of those asking for other information receiving the wrong information. There is also no consideration of additional risks (as above), and the general description makes the claim of benefits doubtful if there has to be a manual check to verify adequate classification.

The ATS disclosure does not provide sufficient contact information for the owner of the AI (perhaps because they were contracted on limited after-sales service terms…), although there is generic contact information for the ICO that could be used by someone who considered the situation unsatisfactory or would like to probe it further.

Food Hygiene Rating Scheme – AI

This tool is also based on machine learning to make predictions. “A machine learning framework called LightGBM was used to develop the FHRS AI model. This model was trained on data from three sources: internal Food Standards Agency (FSA) FHRS data, publicly available Census data from the 2011 census and open data from HERE API. Using this data, the model is trained to predict the food hygiene rating of an establishment awaiting its first inspection, as well as predicting whether the establishment is compliant or not.” “Utilising the service, the Environmental Health Officers (EHOs) are provided with the AI predictions, which are supplemented with their knowledge about the businesses in the area, to prioritise inspections and update their inspection plan.”
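
As a rough sketch of the kind of LightGBM set-up described (one model predicting the hygiene rating of an establishment awaiting its first inspection and another predicting compliance), one could imagine something like the code below. The column names are placeholders loosely inspired by the three data sources mentioned, not the FSA’s actual features or pipeline:

```python
# Rough sketch of the disclosed set-up: LightGBM models predicting the FHRS
# rating and a compliant/non-compliant flag. Features and data are placeholders.
import lightgbm as lgb
import pandas as pd

FEATURES = ["business_type", "local_authority", "area_deprivation_index",
            "nearby_food_outlets", "months_since_registration"]

def train_fhrs_models(df: pd.DataFrame):
    X = df[FEATURES].copy()
    for col in ("business_type", "local_authority"):
        X[col] = X[col].astype("category")  # LightGBM handles pandas categoricals

    rating_model = lgb.LGBMRegressor(n_estimators=200)       # predicted rating (0-5)
    rating_model.fit(X, df["fhrs_rating"])

    compliance_model = lgb.LGBMClassifier(n_estimators=200)  # compliant vs not
    compliance_model.fit(X, df["compliant"])
    return rating_model, compliance_model
```

In the disclosed design, the outputs of models like these are then provided to Environmental Health Officers alongside their local knowledge, rather than replacing the inspection itself.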

Regarding the justification for the development, the disclosure stresses that “the number of businesses classified as ‘Awaiting Inspection’ on the Food Hygiene Rating Scheme website has increased steadily since the beginning of the pandemic. This has been the key driver behind the development of the FHRS AI use case.” “The objective is to help local authorities become more efficient in managing the hygiene inspection workload in the post-pandemic environment of constrained resources and rapidly evolving business models.”

Interestingly, the disclosure states that the tool “has not been released to actual end users as yet and hence the maintenance schedule is something that cannot be determined at this point in time (June 2022). The Alpha pilot started at the beginning of April 2022, wherein the end users (the participating Local Authorities) have access to the FHRS AI service for use in their day-to-day workings. This section will be updated depending on the outcomes of the Alpha Pilot ...” It remains to be seen whether there will be future updates on the disclosure, but an error in copy-pasting in the ATS disclosure makes it contain the same paragraph but dated February 2022. This stresses the need to date and reference (eg v.1, v.2) the successive versions of the same disclosure, which does not seem to be a field of the current template, as well as to create a repository of earlier versions of the same disclosure.

The section on oversight stresses that “the system has been designed to provide decision support to Local Authorities. FSA has advised Local Authorities to never use this system in place of the current inspection regime or use it in isolation without further supporting information”. It also stresses that “Since there will be no change to the current inspection process by introducing the model, the existing appeal and review mechanisms will remain in place. Although the model is used for prioritisation purposes, it should not impact how the establishment is assessed during the inspection and therefore any challenges to a food hygiene rating would be made using the existing FHRS appeal mechanism.”

The disclosure also provides detailed information on IAs: “The different impact assessments conducted during the development of the use case were 1. Responsible AI Risk Assessment; 2. Stakeholder Impact Assessment; [and] 3. Privacy Impact Assessment.” Concerning the responsible AI risk assessment, in addition to a personal data issue that should belong in the DPIA, the disclosure reports three identified risks very much in line with the ones I had hinted at above: “2. Potential bias from the model (e.g. consistently scoring establishments of a certain type much lower, less accurate predictions); 3. Potential bias from inspectors seeing predicted food hygiene ratings and whether the system has classified the establishment as compliant or not. This may have an impact on how the organisation is perceived before receiving a full inspection; 4. With the use of AI/ML there is a chance of decision automation bias or automation distrust bias occurring. Essentially, this refers to a user being over or under reliant on the system leading to a degradation of human-reasoning.”

The disclosure presents related mitigation strategies as follows: “2. Integration of explainability and fairness related tooling during exploration and model development. These tools will also be integrated and monitored post-alpha testing to detect and mitigate potential biases from the system once fully operational; 3. Continuously reflect, act and justify sessions with business and technical subject matter experts throughout the delivery of the project, along with the use of the three impact assessments outlined earlier to identify, assess and manage project risks; 4. Development of usage guidance for local authorities specifically outlining how the service is expected to be used. This document also clearly states how the service should not be used, for example, the model outcome must not be the only indicator used when prioritising businesses for inspection.”
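
In its simplest form, the fairness-related tooling mentioned in the second mitigation (detecting whether the model consistently scores certain types of establishment lower) could amount to a group-level comparison of predictions along the lines of the sketch below; the column names and the threshold are illustrative assumptions only:

```python
# Illustrative group-level fairness check: flag business types whose mean
# predicted rating deviates markedly from the overall mean. Column names and
# the tolerance are assumptions; real tooling would be more sophisticated.
import pandas as pd

def group_score_gaps(preds: pd.DataFrame, group_col: str = "business_type",
                     score_col: str = "predicted_rating",
                     tolerance: float = 0.5) -> pd.Series:
    overall_mean = preds[score_col].mean()
    gaps = preds.groupby(group_col)[score_col].mean() - overall_mean
    return gaps[gaps.abs() > tolerance]  # groups warranting closer scrutiny
```

Of course, whether such checks are actually run post-deployment, and what follows when a gap is detected, is the kind of detail that determines whether the mitigation is meaningful.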

In this instance, too, the ATS disclosure is in itself useful because it allows the type of analysis above and, in case someone considers the situation unsatisfactory or would like to probe it further, there is a clear gateway to (try to) engage the entity responsible for this AI deployment. It is also interesting to see that the disclosure specifies that the private provider was engaged “As well as [in] a development role [… to provide] Responsible AI consulting and delivery services, including the application of a parallel Responsible AI sprint to assess risk and impact, enable model explainability and assess fairness, using a variety of artefacts, processes and tools”. This is clearly reflected in the ATS disclosure and could be an example of good practice where organisations lack that in-house capability and/or outsource the development of the AI. Whether that role should fall to the developer, or should rather be kept separate to avoid organisational conflicts of interest, is a discussion for another day.

Final thoughts

There seems to be a mixed picture on the usability of the ATS disclosures, with some of them not entirely providing (full) usability, or a clear pathway to engage with the specific entity in charge of the development of the algorithmic tool, especially where development was outsourced to an external provider. In those cases, the public authority that has implemented the AI (even if not the owner of the project) will have to deal with any issues arising from the disclosure. There is also mixed practice concerning linking to resources other than previously available (open) data (eg open source code, data sources), with only one of the projects discussed above (GOV.UK) including them in its disclosure.

It will be interesting to see how this assessment scales up (to use a term) once disclosures increase in volume. There is clearly a research opportunity arising as soon as more ATS disclosures are published. As a hypothesis, I would submit that disclosure quality is likely to decrease with volume, as well as with the withdrawal of whatever support the pilot phase has provided to participating institutions. Let’s see how that empirical issue can be assessed.

The other reflection I have to offer based on these first four disclosures is that there are points of information in the disclosures that can be useful, at least from an academic (and journalistic?) perspective, to assess the extent to which the public sector has the capabilities it needs to harness digital technologies (more on that soon in this blog).

The four reviewed disclosures show that there was one in-house development (GOV.UK), while the others were either procured (QCovid, whose disclosure includes a redacted copy of the contract) or contracted out, perhaps even directly awarded (ICO email classifier, FSA FHRS – AI). And there are some between-the-lines indications that some of the implementations may have been developed on a relatively speculative basis, unless there was strong pre-existing reliable statistical data (eg on information requests concerning change of business address). This in itself triggers questions about the procurement or commissioning strategy developed by institutions seeking to harness AI’s potential.

From this perspective, the ATS disclosures can be a useful source of information on the extent to which the adoption of AI by the public sector depends as strongly on third-party capabilities as the literature generally hypothesises and/or is starting to demonstrate empirically.

The importance of procurement for public sector AI uptake

In case there was any question on the importance and central role of public procurement for the uptake of artificial intelligence (AI) by the public sector (there wasn’t, though), two recent policy reports confirm that this is the case, at the very least in the European context.

AI Watch’s must-read ‘European landscape on the use of Artificial Intelligence by the Public Sector’ (1 June 2022) makes the point very clearly by reference to the analysis of AI strategies adopted by 24 EU Member States: ‘the procurement of AI technologies or the increased collaboration with innovative private partners is seen as an important way to facilitate the introduction of AI within the public sector. Guidance on how to stimulate and organise AI procurement by civil servants should potentially be strengthened and shared among Member States’ (at 26). Concerning guidance, the report refers to the European Commission’s supported process of developing standard contractual clauses for the procurement of AI (see here), and there is also a twin AI Watch Handbook for the adoption of AI by the public sector (25 May 2022) that includes a recommendation on procurement guidance (‘Promote the development of multilingual guidelines, criteria and tools for public procurement of AI solutions in the public sector throughout Europe’, recommendation 2.5, and details at 34-35).

The European landscape report provides some more interesting detail on national strategies considering AI procurement adaptations.

The need to work together with the private sector in this area is repeatedly stressed. However, strategies mention that historically it has been difficult for innovative companies to work together with government authorities due to cumbersome procurement regulations. In this area, several strategies (12, 50%) [though note the table below indicates 13, rather than 12 strategies] come up with new policy initiatives to improve the procurement processes. The Spanish strategy, for example, mentions that new innovative public procurement mechanisms will be introduced to help the procurement of new solutions from the market, while the Maltese government describes how existing public procurement processes will be changed to facilitate the procurement of emerging technologies such as AI. The Dutch and Czech strategies mention that hackathons for public sector AI will be introduced to assist in the procurement of AI. Civil servants will be given training and awareness in procurement to assist them in this process, something that is highlighted in the Estonian strategy. The French strategy stresses that current procurement regulation already provides a lot of freedom for innovative procurement but that because of risk aversion present within public administrations all possibilities are not taken into consideration (at 25-26, emphasis in the original).

Own elaboration, based on Table 7 in the AI Watch report.

There is also an interesting point on the need to create internal public sector AI capabilities: “Some strategies say that the public organisations should work more together with private organisations (where the missing skillsets are present), either through partnerships or by procurement. On the one hand, this is an extremely important and promising shift in the public sector that more and more must move towards a networking perspective. In fact, the complexity and variety of skills required by AI cannot be always completely internalised. On the other hand, such partnerships and procurement still require a baseline in expertise in AI within the public sector staff to avoid common mistakes or dependency on external parties” (at 23, emphasis added).

Given the strategic importance of procurement, as well as the need to upskill the procurement workforce and to build additional AI capacity in the public sector to manage procurement processes, this is an area of research and policy that will only increase in relevance in the near and longer term.

This same direction of travel is reflected in the also recent UK's Central Digital and Data Office ‘Transforming for a digital future: 2022 to 2025 roadmap for digital and data’ (9 June 2022). One of its main aspirations is to generate ‘Significant savings by leveraging government’s combined purchasing power and reducing duplicative procurement, to shift to a “buy once, use many times” approach to technology’. This should be achieved by the horizontal promotion of ‘a “buy once, use many times” approach to technology, including by making use of a common code, pattern and architecture repository for government’. Implicitly, this will also require a review of procurement policies and practices.

Importantly—and potentially problematically—it will also raise the stakes of AI procurement, in particular if the roll-out of the ‘bought once’ technology is rushed and its negative impacts or implications can only be identified once it has already been propagated, or in relation to some implementations only. Avoiding this will require very careful impact assessments, as well as piloting and scalability approaches that have strong risk-management systems embedded by design.

As always, this will be an area fun to keep an eye on.