As with earlier waves of technology, the arrival of generative artificial intelligence (AI) tools such as ChatGPT has generated great expectations, along with a stream of false or exaggerated claims about what these tools can do.
For journalists, now is the time to experiment with these models, get to know them and learn how to get the most out of them, rather than banning or fearing them, according to the panelists of the webinar "Generative AI: What journalists should know about ChatGPT and other tools," held virtually on Aug. 17 and sponsored by the Knight Center for Journalism in the Americas.
The webinar, which can be viewed again on the Knight Center's YouTube channel, featured Aimee Rinehart, senior product manager for AI strategy at the Associated Press’ Local News AI initiative, and Sil Hamilton, a machine learning engineer and AI researcher-in-residence at the journalism organization Hacks/Hackers. The moderator was Marc Lavallee, director of technology products and strategy for journalism at the Knight Foundation.
In the face of the confusion and uncertainty that ChatGPT has caused for the news industry, it is important to learn to distinguish between what technology companies say generative AI is capable of doing and what it can actually do. And the best way for journalists to understand the capabilities and limitations of this technology is to experiment directly with it, and become familiar with its concepts, techniques and processes. In other words, begin a process of artificial intelligence literacy.
“Big Tech came in and told us how the Internet was going to work. And we have abided by all of the rules that they've set up. It's the same thing now: if we don't get in there, if we don't experiment, they will write the rules,” Rinehart said. “I think this is the time for journalists to get in, experiment, figure out where its deficits are, and where its strong suits are. And then learn from that, and create standards around that, because right now we're just saying ‘just don't use it,’ and that's not a good position to be in.”
Lavallee mentioned a survey released in May of this year by the World Association of Newspapers and News Publishers (WAN-IFRA) that found that nearly half of the newsrooms surveyed said they were using generative AI tools and most viewed them in a positive light. However, only 20 percent of those newsrooms had already formulated best practice guidelines and policies around their use.
“We're only at the beginning of understanding how they work. And a big part of that is experimentation, is asking basic questions, and is definitely recalling that we don't know how they work,” Hamilton said. “But we do know what they've been trained on, and especially with journalists and newsroom organizations, it's really important that they not only develop policies for working with AI, but also controlling their own data.”
Rinehart emphasized that the best opportunities with generative AI tools lie with local media: on the one hand, they have the greatest needs and gaps in technical and human resources; on the other, as small organizations they have more room for error and can correct mistakes faster than large outlets.
“I think it is their moment right now. [...] They simply just need more help covering [the news]. They often have a wider geographical area, because so many other newsrooms have closed and they often don't have as many people in the newsroom,” Rinehart said. “So they have more to do with less resources. And to me, that's where AI can really come in and bridge that gap.”
The journalist said the AP began making inroads with artificial intelligence in 2014 by automating stories based on companies' financial reports. The agency went from producing 300 reports manually to 3,000 with artificial intelligence, which allowed it to focus resources and efforts on journalistic projects of greater impact.
Rinehart said that, just as the AP did with the analysis of financial reports, media outlets that want to make inroads with artificial intelligence should start by identifying repetitive tasks in their editorial and business processes. They should also listen to team members and identify the tasks where they feel most pressured or overwhelmed. From there, they can analyze whether they need an automated tool.
These recommendations stem from a framework for bringing artificial intelligence into newsrooms, derived from a 2021 AP survey of more than 200 U.S. news outlets, which Rinehart said will soon be available for consultation.
“I think it's really useful, and can kind of lower the temperature of like, ‘Oh, my gosh, we have to have AI right now,’ because maybe you just need a different workflow. Or maybe you just need a simple process automation to make it all better,” she said.
As for text-generating AI tools, the journalist also suggested not limiting oneself to ChatGPT, but trying other available platforms, such as Claude, developed by Anthropic, a tech startup founded by former OpenAI employees, and Bard, developed by Google. Rinehart said it is best to run tests on the different platforms and determine which one offers the best results for each task and news outlet.
ChatGPT is an artificial intelligence model based on technology called Generative Pre-trained Transformer, which is designed to generate coherent, human-like text responses in a conversational manner. It was developed by the U.S. artificial intelligence research lab OpenAI.
OpenAI released its first pre-trained language model (GPT) in 2018. ChatGPT currently runs GPT-4, the most advanced version so far, in its paid Plus tier, and GPT-3.5 in its free tier.
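For newsrooms that want to go beyond the chat interface, these same models can also be reached programmatically. The following is a minimal sketch using OpenAI's official openai Python package; the model names mirror the tiers described above, but the exact client interface depends on the package version installed.

```python
# Minimal sketch: calling a GPT model through the openai Python package (v1+ client).
# Assumes the OPENAI_API_KEY environment variable is set.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # the model behind ChatGPT's free tier; "gpt-4" powers Plus
    messages=[
        {"role": "user", "content": "Explain the inverted pyramid in one sentence."},
    ],
)
print(response.choices[0].message.content)
```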
Systems like ChatGPT have been trained on huge amounts of data coming from the internet. However, one of the main criticisms of these systems concerns their accuracy and the credibility of their sources, as their answers are based on patterns, not facts, Hamilton explained. That is why journalists are advised not to use these tools to generate content from scratch.
"The good rule of thumb is that these models are very good at form, not content," Hamilton said. "It's great at form, but when you ask it to write from scratch, it falls short on almost everything, and that's in a wide range of areas. [...] It's not good at writing from scratch. So if you want to work with the model, always make sure you give it something to work with."
That's where journalism has a big opportunity, according to the researcher, as news outlets generate high-quality content, and this may be the raw material with which to leverage generative AI. Claude and Bard are able to read different file types, such as PDFs, images and videos. From these, reporters can prompt specific actions such as summarizing, rewriting, translating or asking specific questions about the content.
“AI, we know it works the best when we give it high quality information. And on the internet, newsrooms have been producing generally the highest quality information you can get,” Hamilton said. “Suddenly the model will, instead of relying on its own knowledge base, look at the content that I provided it in the prompt. And this is called ‘in-context learning,’” he later added.
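To make that idea concrete, here is a hedged sketch of the in-context learning pattern Hamilton describes: the prompt carries the newsroom's own article, and the model is instructed to answer only from that text. The openai package and client setup are assumed as in the earlier example.

```python
# Sketch of "in-context learning": the prompt supplies the source material,
# so the model works from the text we provide rather than its training data.
from openai import OpenAI

client = OpenAI()

article = """(Full text of a published, verified story would go here.)"""

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system",
         "content": "Answer using only the article provided. "
                    "If the article does not say, reply 'not stated'."},
        {"role": "user",
         "content": f"Article:\n{article}\n\nSummarize this article in three sentences."},
    ],
)
print(response.choices[0].message.content)
```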
Rinehart explained that systems like ChatGPT start from two bases: the language base and the knowledge base. For journalists, the panelist recommended starting from the language base, since the level of knowledge that these models use is still uncertain in most cases.
The journalist also emphasized that, unlike other technological developments built primarily for English, generative AI models have performed well in other languages.
“A safer way for a small newsroom to experiment with this is taking that story that you've written and ask it [AI] to write three headlines,” Rinehart said. “Ask it to do a Twitter thread, or a Facebook post, or a summary of it, depending on where and how you need it to be used. I think that's a really good way to use it.”
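A sketch of that workflow, under the same assumptions as the examples above: a hypothetical helper takes a finished, verified story and asks the model only for derived formats, never for new facts.

```python
# Hypothetical helper: derive headlines or social copy from a story the newsroom
# has already written and verified. The story supplies the facts; the model, the form.
from openai import OpenAI

client = OpenAI()

def derive(story_text: str, task: str) -> str:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": f"Here is our published story:\n{story_text}\n\nTask: {task}",
        }],
    )
    return response.choices[0].message.content

# Example uses:
# story = open("story.txt").read()
# print(derive(story, "Write three headline options, each under 70 characters."))
# print(derive(story, "Draft a short Twitter thread summarizing this story."))
```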
Because of the uncertainty of the knowledge base from which they work, ChatGPT and its counterparts are prone to "hallucinate." That is, they produce answers that seem logical but may be inaccurate or incorrect. This is because, Hamilton explained, GPT models understand the meaning of words and how they relate to each other. However, they develop this understanding of words without ever having "seen" what the words represent.
Hence the importance of providing the models with quality information on which to work.
“Hallucinations occur when the model stretches this knowledge a little bit past what it really fundamentally knows, when it tries to confabulate. There's a growing trend of calling hallucinations confabulations, because it's when GPT tries to fill in the blanks of its knowledge when there's gaps in its knowledge, and it doesn't quite understand that something relates in one way to another thing, and it tries to just come up with a way that they might relate.”
Hallucinations are a natural consequence of GPT models’ architecture, Hamilton explained, so journalists should always take information from them as they would take information from any other source and verify it, even when that information seems to make sense.
Among the most recurring questions from the webinar audience were those about the risk of journalists losing their jobs as generative AI technologies grow. The panelists agreed that, rather than fear the technology, news professionals should get to know how it works and adapt to the new reality.
“The way I’ve been able to stay in this industry for so long is because I've been able to adapt quickly, or at least be interested in what's coming next,” Rinehart said. “I think if you're working in newsrooms and you want to be working in newsrooms five years from now, you should be looking at this space seriously. It is not a fad, it's not going to go away, and the more conversant you can be in this topic, the better your reporting will be, the better your understanding of where news is headed.”
This emerging technology brings with it the opportunity to revolutionize the format in which news is delivered to the audience, just as other technological advances, such as the telegraph, once did, Rinehart said. In the 19th century, the telegraph sparked the emergence of the inverted pyramid format for news stories, which the AP standardized in its style manual in 1953.
Similarly, ChatGPT is creating a new way for the public to get their questions answered on the Internet, Rinehart said. No longer with links that lead to other websites, but with text that directly answers what the user needs. That's an opportunity for newsrooms to find ways to meet that new audience preference with the integration of generative AI models into their reader interaction processes.
“We have learned a lot through that question-answer thing that ChatGPT has brought to public awareness. And that's how people like information answered for them, they don't want links. It's clear they want a paragraph that explains it. So can newsrooms learn from that and integrate that into their offering?” Rinehart asked.
She said that employment trends in the journalism industry will mainly be directed toward increased demand for positions related to SEO (search engine optimization), and editing and proofreading jobs.
For his part, Hamilton added that the so-far limited ability of generative AI models to generate text from scratch safeguards the jobs of reporters and writers. But, he said, these models will eventually be integrated into workflows in which they work alongside journalists.
“The development of these models as the trajectory currently is, is to simply be co-working with us, that it won't entirely automate jobs away,” he said. “It's really important to stay aware of the space like Aimee is saying, [...] to actually use these tools on a day-to-day basis and figure out where it slots into your own workflow, because it's going to be different for everyone.”
In generative AI, a prompt is the instruction given to a model to generate the desired content or response. The clearer and more structured the prompt, the more accurate the platform's response will be. According to the panelists, prompts are the way users guide models such as ChatGPT toward useful material.
The process of designing effective prompts to elicit relevant responses is known as prompt engineering. Journalists who want to get the most out of generative AI must become good prompt engineers, according to the panelists. The more context and instructions the prompt includes, the better the result.
“We can all become better prompt engineers in terms of asking more thorough questions,” Rinehart said. “Marketing people are doing better than journalists at prompting, because, you know, a 3-word prompt isn't going to get you much. But starting off [prompting the model] with something like ‘pretend you are a college professor assigning a second year student. This assignment…,’ and so on. Those details will get you maybe closer to the results that you want.”
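As an illustration of the difference context makes, here is a hedged before-and-after sketch in the spirit of Rinehart's example; the wording is hypothetical, not taken from the webinar.

```python
# A terse prompt versus a structured one that sets a role, context, and output format.
terse_prompt = "Write a headline."

structured_prompt = """You are a copy editor at a local news outlet.
Context: the story below covers a city council vote on a new transit budget.
Task: write three headline options, each under 70 characters, in AP style,
without editorializing.

Story:
{story_text}
""".format(story_text="(Verified story text goes here.)")
```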
Developing effective prompts for the purposes of each newsroom can be time-consuming. This is because it is a trial-and-error activity, and because it takes a certain amount of intuition and getting to know the model better, Hamilton said. Users should be aware that GPT models are not capable of fully understanding what they are being told. Hence the importance of creating prompts with sufficient information and context.
“Sometimes you have to be like an ‘AI whisperer,’ and that's both good and bad. So, on the good side it holds the promise of engineering, that you can make it deterministic and safe and dependable,” he said. “But it's also because you're dealing with a model that we don't fully understand, once again, prompt engineering is a little bit of a lie in the sense that you can't engineer yourself to a perfect result every single time. It depends a lot on intuition. You have to work with the model a ton to begin to get a sense of how it thinks.”
Hamilton and Rinehart agreed on the need to be careful not to share sensitive or confidential information in prompts. While OpenAI implemented a policy this year of not using the information users provide to train the ChatGPT model, it is still uncertain how much data remains in the cloud on these platforms.
“A lot of these tools take and collect and use whatever you put in there. So if you're working on a big juicy story, I would stay away from any type of inputting to the cloud,” Rinehart said.
For journalists who want an in-depth look at how to use this new technology, the Knight Center will offer the free online course "How to use ChatGPT and other generative AI tools in your newsroom." The course will run from Sept. 25 to Oct. 22, 2023, announced Rosental Alves, director of the Knight Center, at the end of the webinar.
Banner: Image created with artificial intelligence through Microsoft Bing