Headlines This Week
- In a big win for human artists, a Washington D.C. judge has ruled that AI-generated art lacks copyright protections.
- Meta has released SeamlessM4T, an automated speech and text translator that works in dozens of languages.
- New research shows that content farms are using AI to rip off and repackage news articles from legacy media sites. We have an interview with one of the researchers who uncovered this mess below.
- Last but not least: Stephen King has some thoughts about the fact that his books were used to train text-generating algorithms.
The Top Story: Cruise’s Big Stumble
For years, Silicon Valley has promised us self-driving cars. For just as many years, imperfect tech has thwarted those promises. In recent weeks, though, it’s seemed like our dreams of a driverless future might finally be coming true. In a decision handed down Aug. 10, the California Public Utilities Commission approved expanded operations for two major “robotaxi” companies—Google’s Waymo and GM’s Cruise. Both companies, which have been testing their automated vehicles in the Bay Area for years, were essentially given free rein to set up shop and start making money off their driverless carriages.
This has rightfully been hailed as a really big deal for the autonomous transportation industry, as it’s pretty much the first time that self-driving cars have been unleashed in this way. According to the CPUC ruling, Waymo is now allowed to operate “commercial passenger service” via its driverless vehicles, and its cars will be able to travel freely throughout both San Francisco and certain areas of San Mateo county; they’ll be allowed to do that any hour of the day, at speeds of up to 65 mph, in any prevailing weather conditions. Cruise has been allowed similar privileges in SF, at speeds of up to 35 mph. Additionally, neither company needs to staff its self-driving taxis with “safety operators,” the human chaperones who have traditionally helped guide automated vehicles.
In short: as of last week, it really looked like both companies were ready to hit the road and never look back.
But this brief moment of triumph was almost immediately cut short by an unfortunate series of events. Late Thursday night, one of Cruise’s taxis slammed into a fire truck in the Tenderloin district, sending a Cruise employee to the hospital. Not long afterward, another Cruise taxi stalled out at a city intersection, causing significant traffic delays in the area. Overnight, Cruise’s successes seemed to evaporate. On Friday, the Department of Motor Vehicles ordered the company to halve the number of vehicles it had on the city’s roadways, citing “recent concerning incidents” involving its cars. The company dutifully complied, rolling back 50 percent of its fleet.
This turn of events now puts autonomous travel at a weird crossroads. With the regulatory strictures loosened, it’s likely that these cars will become an ever bigger part of our lives. The future we’ve been promised is one in which daily travel is a fully automated luxury experience; your robotaxi will barrel down the freeway, using only its expertly designed algorithms to navigate, while you take a nap in the driver’s seat or watch a movie on your iPhone. But is that really how things are going to be? Or will self-driving vehicles mostly serve to clog up intersections, cause fender benders, or worse?
Barry Brown, a computer science professor at Stockholm University, told Gizmodo that, despite the hype, self-driving cars are still far behind where they need to be when it comes to navigating complex roadway systems. Brown has studied self-driving cars for years and says there’s one thing they’re not particularly good at: reading the room—or the road, as it were. “They struggle to understand other drivers’ intentions,” he said. “We humans are actually very good at doing that but these self-driving cars really struggle to work that one out.”
The problem, from Brown’s perspective, is that roadways are actually social domains, rich with subtle interpersonal cues that tell drivers how to interact with one another and their surrounding environment. Self-driving cars, unfortunately, are not very good at picking up on those cues—and are more akin to children who haven’t been socialized properly yet.
“We don’t let five-year-olds drive. We wait until people are at an age where they have a lot of experience understanding how other people move,” said Brown. “We’re all kinda experts at navigating through crowds of people and we bring that understanding to bear when we’re driving as well. Self-driving cars, while they’re very good at predicting trajectory and movement, they struggle to pick up on the cues of other road-users to understand what’s happening.” Complex urban environments are something that these vehicles are not going to be ready to navigate anytime soon, he adds. “You’ve got these basic issues of things like yielding, but then if you get more complicated situations—if there’s cyclists, when there’s pedestrians on the road, when there’s very dense traffic, like in New York—these problems escalate and become even harder.”
The Interview: NewsGuard’s Jack Brewster on the Rise of the Plagiarism Bot
This week, we talked to Jack Brewster, a senior analyst at NewsGuard, whose team recently published a report on how AI tools are being used by shoddy websites to plagiarize news content from legacy media sites. The report, which shines a light on the bizarre emergent world of AI content farming, shows that some sites appear to have fully automated the article-creation process, using bots to scrape news sites, then using AI chatbots to re-write that content into aggregated news, which is then monetized through ad deals. This interview has been edited for brevity and clarity.
How did you initially hear about this trend?
We’ve been tracking something we like to call UAINs—unreliable AI-generated news websites. Basically, it’s any site that seems to be a next-generation content farm that uses AI to pump out articles. As we were looking at these sites, I was noticing these publishing errors [many of the articles included blatant remnants of chatbot use, including phrases like “As an AI language model, I am not sure about the preferences of human readers…”]. I realized that never before have we had the ability to scramble and re-write a news article in the blink of an eye. I wanted to see how many sites were using AI to do this—and that was sorta the beginning of it.
Take me through the AI plagiarism process. How would a person or a website take a New York Times article, feed it into a chatbot, and get an “original” story?
One of the big takeaways here is that a lot of these sites appear to be doing this automatically—meaning they’ve totally automated the copying process. It’s likely that programmers for a site set up code where they have a few target websites; they use bots to crawl those websites for content, and then feed the data into a large language model API, like ChatGPT. Articles are then published automatically—no human required. That’s why I think we’re seeing these “error” messages come up, because the process isn’t seamless yet—at least, not for the sites we surveyed. Obviously, the next question is: well, if these are the sites that are more careless, how many hundreds—if not thousands—are a little bit more careful and are editing out those error messages or have made the process completely seamless?
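For readers who want to picture what that fully automated loop could look like, here is a minimal, hypothetical sketch in Python. It is not code from NewsGuard’s report; the target URL, the rewrite_with_llm() helper, and the publish() stub are all placeholders standing in for whatever crawler, language-model API, and content-management hookup a given site might actually use.

```python
# Hypothetical sketch of the crawl -> LLM rewrite -> auto-publish pipeline
# Brewster describes. Names and endpoints are placeholders, not real details.

import requests
from bs4 import BeautifulSoup

TARGET_SITE = "https://example-news-site.com/latest"  # placeholder target


def fetch_article_text(url: str) -> str:
    """Download a page and pull out its paragraph text."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    return "\n".join(p.get_text() for p in soup.find_all("p"))


def rewrite_with_llm(text: str) -> str:
    """Placeholder for a call to a large language model API that
    paraphrases the scraped article; stubbed out in this sketch."""
    prompt = "Rewrite the following news article in your own words:\n\n" + text
    return prompt  # a real pipeline would return the model's response here


def publish(article: str) -> None:
    """Placeholder for pushing the rewritten article to a CMS."""
    print(article[:200])


if __name__ == "__main__":
    original = fetch_article_text(TARGET_SITE)
    publish(rewrite_with_llm(original))
```

The point of the sketch is how little there is to it: once the three steps are wired together and run on a schedule, no human needs to touch an article before it goes live, which is exactly why stray chatbot error messages end up published verbatim.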
What do you think the implications are for the news industry? You could argue that—if this trend gets big enough—it’ll be siphoning off loads of web traffic from legitimate media organizations.
I’ll say two things. The first and most important thing for the news industry to figure out is how to define this trend…Is it turbo-charged plagiarism or is it efficient aggregation? I think that’s up to the news outlets who are being impacted to talk about, and also for the courts to decide. The other thing I’ll say is that…[this trend] has an impact on our information ecosystem. Even if these sites are not pumping out misinformation per se, if they’re increasing, exponentially, the amount of articles that flood the pathways through which we get new information, it’s going to be very difficult for the average person to separate the quality content from the low quality content. That has an impact on our reading experience and how difficult it is to access quality information.
What about the AI industry? What responsibility do AI companies have to help resolve this issue?
What I will say is that one of the big things we came across when we were researching this story is watermarking…that was one of the things that we encountered when we were doing research about certain safeguards that could be put in place to stop this. Again, that’s for governments and politicians and AI companies themselves to decide [whether they want to pursue that].
Do you think that human journalists should be concerned about this? A significant percentage of the journalism industry now revolves around news aggregation. If you can get a robot to do that with very little effort, doesn’t it seem likely that media companies will move towards that model because they won’t have to pay an algorithm to generate content for them?
Yeah, I guess what I’ll say is that you can imagine a world where a few sites are creating original content and thousands and thousands of bots are copying, re-writing and spitting out versions of that original content. I think that is something we all should be concerned about.