Government agencies will feel pressure to quickly deliver AI-powered services to the public as Congress recommends both guardrails and a “full speed ahead” mindset for federal artificial intelligence adoption. It’s going to happen. But how do you know that bots are not harmful and do not put individual team members, organizations, or citizens at risk?
Government agencies have a duty to provide accurate information to the public, and malicious bots can have legal and moral repercussions. For example, last year, the IRS was cited by the Comptroller’s Office for using AI to flag tax returns for audit after finding that the technology may contain unintended bias. It was done. Although the IRS had human involvement with the system, no other guidance from executive orders or other directives appears to have been in place at the time the potential bias was discovered.
The IRS case is a reminder of how important it is for government agencies to take every step possible to avoid risks to the public and protect government and personal data before risks materialize. Let me do it. It may sound daunting, but federal guidance and frameworks help you understand AI risks, run DevOps and DevSecOps teams concurrently, and ensure your models deliver the highest quality results. It emphasizes what is needed, including the creation of an independent red team. How to do this is not so clear. However, relying on best practices already defined across data security and software development provides the clear path needed to ensure that AI does not pose risks.
put risk at the forefront
Validating AI can be difficult, as many AI models make a trade-off between accuracy and explainability, but it is necessary to reduce risk. Start by asking the questions quality assurance (QA) would ask about your application. What is the risk of failure? What are the potential consequences of that failure? What potential output can your AI system produce? Who can it present it to? What impact could it have?
))>
A risk-based approach to application development is not new, but it needs to be enhanced for AI. Many teams are accustomed to simply creating or purchasing software that meets their requirements. Additionally, DevOps processes have quality and security testing built into the process from the beginning. However, simply applying current QA processes is the wrong approach, as AI requires close consideration of how a system can “malfunction” from its intended use. If the AI makes a mistake, you can’t simply patch it.
adopt an adversarial mindset
Red teams are regularly deployed to discover weaknesses in systems and should be used to test AI, but not in the same way as in traditional application development. The AI Red Team must be separated from the day-to-day development team and its successes and failures.
A government AI red team should include in-house engineers and ethicists, participants from government-owned research labs, and ideally trusted external consultants, but not the ones who build the software or benefit from it. No one is going to receive it. Individuals need to understand how AI systems are likely to impact not only their citizens, but also the broader technology infrastructure in which they are deployed.
AI red teams must take an adversarial mindset to identify harmful or discriminatory outputs from AI systems and unexpected or undesired system behavior. Particular attention should also be paid to the limitations and potential risks associated with the misuse of AI systems.
Red teams need to be freed from the pressure of release timing and political expectations and report to a leader outside of the development and implementation teams, perhaps the chief AI officer (CAIO). This helps ensure the effectiveness of AI models and aligns them with appropriate guardrails.
Rethinking the verification-to-development ratio
Advances in AI have greatly increased efficiency. Chatbots that might have taken months to build can now be created in just days.
Don’t assume that AI tests can be completed just as quickly. Proper validation of AI systems is multifaceted, and the ratio of test time to development time should be closer to 70% to 80% for AI, rather than the typical 35% to 50% for enterprise software. Much of this rise is driven by the fact that requirements often come into sharp focus during testing, making the cycle more of an “iterative development mini-cycle” than a traditional “testing” cycle. DevOps teams must take time to review liability such as training data, privacy violations, bias, error conditions, intrusion attempts, data leaks, and the potential for AI output to make false or misleading statements. Additionally, the red team must allocate its own time to make the system malfunction.
Formulation of AI data guidelines
Government agencies should establish guidelines for what data should and should not be used to train AI systems. When using internal data, agencies must maintain a registry of data and notify data producers that their data will be used to train AI models. Guidelines should be specific to each unique use case.
))>
AI models don’t internally partition data like databases do, so data trained from one source can potentially be accessed by different user accounts. When agencies use sensitive data to train AI models, they should consider adopting a “one model per sensitive domain” policy. This is likely to be the case for most government implementations.
Be transparent about AI output
AI developers need to communicate what content and recommendations are being generated by their AI systems. For example, when an agency’s customers interact with a chatbot, they need to be made aware that the content was generated by AI.
Similarly, if an AI system generates content such as documents or images, authorities may be required to maintain a registry of those assets so that they can later be verified as “authentic.” There is. Such assets may also require digital watermarking. Although this is not yet a requirement, many government agencies have already adopted this best practice.
Government agencies must continually monitor, red team, refine, and validate their models to ensure they work as intended and provide accurate, unbiased information. By prioritizing independence, integrity, and transparency, the model built today gives foundations the capabilities they need to improve their operations and serve the public while maintaining public safety and privacy. Provide.
David Colwell is Vice President of Artificial Intelligence and Machine Learning at Tricentis, a provider of automated software testing solutions designed to accelerate application delivery and digital transformation.
Copyright © 2024 Federal News Network. Unauthorized reproduction is prohibited. This website is not directed to users within the European Economic Area.