It hardly takes ChatGPT to generate a list of reasons why generative artificial intelligence isn’t great. Algorithms are often fed creative work without permission, they harbor troubling biases, and they require large amounts of energy and water to train. These are all serious problems.
Putting all that aside, however, it is amazing how powerful generative AI can be for prototyping potentially useful new tools.
I witnessed this firsthand by visiting the Sundai Club, a generative AI hackathon held every Sunday near the MIT campus. A few months ago, the group graciously agreed to let me sit in, and we decided to spend the session exploring tools that might be useful to journalists. The club is supported by a Cambridge non-profit called Æthos, which promotes the socially responsible use of AI.
The Sundai Club’s members include students from MIT and Harvard, several professional developers and product managers, and even one person who works for the military. Each event begins with a brainstorm of possible projects, which the group then narrows down to what it will actually try to build.
Notable proposals from the journalism hackathon included using a multimodal language model to track political posts on TikTok, automatically generating freedom of information requests and appeals, and summarizing video clips of local court hearings to inform local news reporting.
Ultimately, the group decided to build a tool to help reporters covering AI identify potentially interesting papers posted on Arxiv, a popular server for research paper preprints. My presence probably swayed them, considering I had said during the session that scouring Arxiv for interesting research was a top priority for me.
Having settled on the goal, the team’s programmers used the OpenAI API to create word embeddings (mathematical representations of words and their meanings) of Arxiv AI papers. This made it possible to analyze the data to find papers related to a particular term, or to explore relationships between different research fields.
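To give a feel for how embeddings make this kind of search possible, here is a minimal sketch in Python. It is not the Sundai Club's code: the paper titles and the three-number vectors are invented for illustration, standing in for the much longer vectors an embedding API would return, and the function names are hypothetical. The idea is simply that a paper whose vector points in nearly the same direction as the query's vector is likely to be about the same thing.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_papers(query_vec, paper_vecs):
    """Sort (title, score) pairs from most to least similar to the query."""
    scored = [
        (title, cosine_similarity(query_vec, vec))
        for title, vec in paper_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data (assumption): real embeddings have hundreds or thousands of
# dimensions and would come from an embedding model, not be typed by hand.
papers = {
    "Planning with LLM agents": [0.9, 0.1, 0.2],
    "Image denoising with diffusion": [0.1, 0.9, 0.3],
    "Tool use in autonomous agents": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.1]  # stand-in for the embedding of a search term

for title, score in rank_papers(query, papers):
    print(f"{score:.3f}  {title}")
```

With these made-up numbers, the two agent-related titles rank above the image paper, because their vectors lie closer to the query's direction.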
Using additional word embeddings of Reddit threads and Google News search results, the programmers created a visualization that displays research papers alongside Reddit discussions and related news reports.
The resulting prototype, called AI News Hound, is rough-and-ready, but it shows how large language models can help mine information in interesting new ways. This is a screenshot of the tool being used to search for the term “AI agent.” The two green squares closest to the news article and Reddit clusters represent research papers that could be included in an article about efforts to build AI agents.