Summarizing Bills With Generative AI

So far in 2023, thousands of bills have been introduced in jurisdictions across the country. The sooner people understand them, the sooner they can start advocating for what they care about. Since bills can be long and complex, it’s often helpful to have a summary of the changes the bill is proposing.

Let’s take SB 396, “To create the social media safety act; to require age verification for use of social media; and to clarify liability for failure to perform age verification for use of social media and illegal retention of data.” This bill, from Arkansas’s 2023 session, regulates minors’ use of social media through measures such as verifying users’ ages and requiring parental consent for minors. The bill runs for pages, with technical jargon and multiple textual changes. As with most Arkansas bills, it was released without a legislature-provided summary.

Given recent advances in artificial intelligence on language tasks, we believe language models can help us understand bills. Here’s an AI summary for SB 396:

“In SB 396, the bill proposes the creation of the Social Media Safety Act, which requires social media companies to verify the age of account holders, obtain parental consent for minors, and use reasonable age verification methods, which may include obtaining digitized identification or using other commercially reasonable age verification methods. The bill specifies that social media companies that knowingly violate the proposed act will be liable for damages resulting from a minor accessing a social media platform. The bill also outlines liability for commercial entities and third-party vendors related to the retention of identifying information after access to a social media platform has been granted.”

Summarizing Bill Texts

Language models (LMs) are generative models trained on large amounts of unstructured text, with the goal of predicting the next word, or the words adjacent to a given word, in the text corpus. With enough training data, parameters, and computing power, these models can perform language tasks at a human level and even maintain coherent conversations. Language models underlie technologies such as ChatGPT, Bard, and others.
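To make "predicting the next word" concrete, here is a toy sketch that counts which word follows which in a tiny made-up corpus. It is only an illustration of the training objective: real language models use neural networks with many parameters rather than simple counts, and the function names and corpus below are our own assumptions, not part of any production system.

```python
from collections import Counter, defaultdict

def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        following[current_word][next_word] += 1
    return following

def predict_next_word(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Made-up training text, for illustration only.
corpus = (
    "the bill requires age verification "
    "the bill requires parental consent "
    "the act clarifies liability"
)
model = train_bigram_model(corpus)
print(predict_next_word(model, "bill"))
```

The more text a model like this sees, the better its guesses become. Modern language models follow the same idea at a vastly larger scale.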

Given the current capabilities of LMs and generative AI, we decided to try using them to summarize bills. There’s ample evidence of language models being used to summarize news articles, emails, contracts, and even short stories, although longer texts such as books still present a challenge. Given that success in other fields, we started by analyzing bill texts with common summarization techniques.

But it’s not a trivial task! Our first round of testing revealed some barriers and rough edges to this approach. Here are a few of the AI-generated summaries from that round, with corresponding policy expert evaluations:

As you can see, the model summarizes some bills successfully, but in other cases it confuses existing statute with bill text and gives an incorrect summary of the proposed bill.

GET STARTED WITH PLURAL

Summarizing Bill Changes

Using AI and LMs to summarize bills may be more complex than it seems. Since bills are mainly proposed changes to current statutes, a summarization model that receives the full proposed text is often unable to differentiate between current and proposed law. However, if we pass only the changed text, the model lacks the context it needs to understand what the bill is proposing to change. For instance, raising the minimum wage from $12 to $15 may be a single-character change.
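The minimum wage example can be sketched in a few lines of Python using the standard library's difflib module. The statute text, the function names, and the prompt layout below are all hypothetical, written only to illustrate the trade-off. This is not the combination of models and prompts we ended up using.

```python
import difflib

# Made-up statute text, for illustration only.
current_statute = [
    "Section 1. Minimum wage.",
    "Every employer shall pay each employee wages",
    "at a rate of not less than $12 per hour.",
]
proposed_statute = [
    "Section 1. Minimum wage.",
    "Every employer shall pay each employee wages",
    "at a rate of not less than $15 per hour.",
]

def changed_lines_only(old, new):
    """Return only the removed (-) and added (+) lines, with no context."""
    matcher = difflib.SequenceMatcher(a=old, b=new)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        changes += [f"- {line}" for line in old[i1:i2]]
        changes += [f"+ {line}" for line in new[j1:j2]]
    return changes

def build_prompt(old, new):
    """Hypothetical prompt layout: the current law, then the labeled changes."""
    return "\n".join([
        "CURRENT LAW:", *old, "",
        "PROPOSED CHANGES (- removed, + added):", *changed_lines_only(old, new), "",
        "Summarize what the bill changes relative to current law.",
    ])

print(changed_lines_only(current_statute, proposed_statute))
print(build_prompt(current_statute, proposed_statute))
```

On their own, the changed lines never mention that the rate is a minimum wage. Pairing them with the current law, and labeling which is which, gives a model both the change and the context around it.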

After additional research and help from our policy experts, we were able to find a combination of models and prompts that correctly identify most of the changes, with a better understanding of the current statutes and proposed law. Although it’s not perfect, the performance is significantly better.

Next Steps: Utilizing AI to Improve Policy Analysis

Despite improved performance, there are still hurdles to overcome. LMs can hallucinate, confidently share wrong answers, and even give biased responses. LMs also have a hard time analyzing very long texts, so large omnibus bills may be out of the question for now. Finally, there is still a lot of room to improve the models and prompts beyond this experiment.
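As a rough illustration of the length problem, the sketch below estimates whether a bill would fit in a model's input window. Every number in it is an assumption chosen for the example: the 1.3 tokens-per-word ratio is only a rule of thumb for English text, and the 4,000-token limit and 500-token reserve are placeholders rather than the limits of any particular model.

```python
def estimate_tokens(text, tokens_per_word=1.3):
    """Rough token estimate from a word count (rule-of-thumb ratio, an assumption)."""
    return int(len(text.split()) * tokens_per_word)

def fits_in_context(text, context_limit=4000, reserved_for_summary=500):
    """Check whether the text plus room for the summary fits a hypothetical limit."""
    return estimate_tokens(text) + reserved_for_summary <= context_limit

# Stand-ins for real bill text: a short bill and a very long omnibus bill.
short_bill = "word " * 1000
omnibus_bill = "word " * 10000

print(fits_in_context(short_bill))
print(fits_in_context(omnibus_bill))
```

A check like this can only flag bills that are too long for a single pass. It does not solve the harder problem of summarizing them well.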

While limitations exist, it’s likely that LMs and generative AI can support and improve policy analysis, as they have in other text-based fields. This remains a hard problem, and at Plural we’re committed to continuing to explore the intersection of AI and public policy.
