Engadget

Engadget is a web magazine with obsessive daily coverage of everything new in gadgets and consumer electronics

OpenAI co-founder and Chief Scientist Ilya Sutskever is leaving the company

Wed, 05/15/2024 - 00:46

Ilya Sutskever has announced on X, formerly known as Twitter, that he's leaving OpenAI almost a decade after he co-founded the company. He said he's confident that OpenAI "will build [artificial general intelligence] that is both safe and beneficial" under the leadership of CEO Sam Altman, President Greg Brockman and CTO Mira Murati. In his own post about Sutskever's departure, Altman called him "one of the greatest minds of our generation" and credited him for his work with the company. Jakub Pachocki, previously OpenAI's Director of Research and the person who headed the development of GPT-4 and OpenAI Five, has taken over Sutskever's role as Chief Scientist.

After almost a decade, I have made the decision to leave OpenAI. The company’s trajectory has been nothing short of miraculous, and I’m confident that OpenAI will build AGI that is both safe and beneficial under the leadership of @sama, @gdb, @miramurati and now, under the…

— Ilya Sutskever (@ilyasut) May 14, 2024

While Sutskever and Altman praised each other in their farewell messages, the two were embroiled in the company's biggest scandal last year. In November, OpenAI's board of directors suddenly fired Altman and company President Greg Brockman. "[T]he board no longer has confidence in [Altman's] ability to continue leading OpenAI," the ChatGPT maker announced back then. Sutskever, who was a board member, was involved in their dismissal and was the one who asked both Altman and Brockman to join separate meetings where they were informed that they were being fired. According to reports that came out at the time, Altman and Sutskever had been butting heads over how quickly OpenAI was developing and commercializing its generative AI technology.

Both Altman and Brockman were reinstated just five days after they were fired, and the original board was disbanded and replaced with a new one. Shortly before that happened, Sutskever posted on X that he "deeply regre[tted his] participation in the board's actions" and that he would do everything he could "to reunite the company." He then stepped down from his role as a board member, and while he remained Chief Scientist, The New York Times says he never really returned to work.

Sutskever shared that he's moving on to a new project that's "very personally meaningful" to him, though he has yet to share details about it. As for OpenAI, it recently unveiled GPT-4o, which it claims can recognize emotion and can process and generate output in text, audio and images.

Ilya and OpenAI are going to part ways. This is very sad to me; Ilya is easily one of the greatest minds of our generation, a guiding light of our field, and a dear friend. His brilliance and vision are well known; his warmth and compassion are less well known but no less…

— Sam Altman (@sama) May 14, 2024

This article originally appeared on Engadget at https://www.engadget.com/openai-co-founder-and-chief-scientist-ilya-sutskever-is-leaving-the-company-054650964.html?src=rss
Categories: Technology

Google Project Astra hands-on: Full of potential, but it’s going to be a while

Tue, 05/14/2024 - 18:56

At I/O 2024, Google’s teaser for Project Astra gave us a glimpse at where AI assistants are going in the future. It’s a multi-modal feature that combines the smarts of Gemini with the kind of image recognition abilities you get in Google Lens, as well as powerful natural language responses. However, while the promo video was slick, after getting to try it out in person, it's clear there’s a long way to go before something like Astra lands on your phone. Here are our takeaways from our first experience with Google’s next-gen AI.

Sam’s take:

Currently, most people interact with digital assistants using their voice, so right away, Astra’s use of multi-modality (i.e. sight and sound in addition to text/speech) to communicate with an AI feels relatively novel. In theory, it allows computer-based entities to work and behave more like a real assistant or agent – which was one of Google’s big buzzwords for the show – instead of something more robotic that simply responds to spoken commands.


In our demo, we had the option of asking Astra to tell a story based on some objects we placed in front of the camera, after which it told us a lovely tale about a dinosaur and its trusty baguette trying to escape an ominous red light. It was fun, the tale was cute, and the AI worked about as well as you would expect. But at the same time, it was far from the seemingly all-knowing assistant we saw in Google's teaser. And aside from maybe entertaining a child with an original bedtime story, it didn’t feel like Astra was doing as much with the info as you might want.

Then my colleague Karissa drew a bucolic scene on a touchscreen, at which point Astra correctly identified the flower and sun she painted. But the most engaging demo was when we circled back for a second go with Astra running on a Pixel 8 Pro. This allowed us to point its cameras at a collection of objects while it tracked and remembered each one’s location. It was even smart enough to recognize my clothing and where I had stashed my sunglasses even though these objects were not originally part of the demo.

In some ways, our experience highlighted the potential highs and lows of AI. Just the ability for a digital assistant to tell you where you might have left your keys or how many apples were in your fruit bowl before you left for the grocery store could help you save some real time. But after talking to some of the researchers behind Astra, there are still a lot of hurdles to overcome.


Unlike a lot of Google’s recent AI features, Astra (which is described by Google as a “research preview”) still needs help from the cloud instead of being able to run on-device. And while it does support some level of object permanence, those “memories” only last for a single session, which currently only spans a few minutes. And even if Astra could remember things for longer, there are things like storage and latency to consider, because for every object Astra recalls, you risk slowing down the AI, resulting in a more stilted experience. So while it’s clear Astra has a lot of potential, my excitement was weighed down with the knowledge that it will be some time before we can get more full-featured functionality.

Karissa’s take:

Of all the generative AI advancements, multimodal AI has been the one I’m most intrigued by. As powerful as the latest models are, I have a hard time getting excited for iterative updates to text-based chatbots. But the idea of AI that can recognize and respond to queries about your surroundings in real-time feels like something out of a sci-fi movie. It also gives a much clearer sense of how the latest wave of AI advancements will find their way into new devices like smart glasses.

Google offered a hint of that with Project Astra, which may one day have a glasses component, but for now is mostly experimental (the glasses shown in the demo video during the I/O keynote were apparently a “research prototype”). In person, though, Project Astra didn’t exactly feel like something out of a sci-fi flick.


It was able to accurately recognize objects that had been placed around the room and respond to nuanced questions about them, like “which of these toys should a 2-year-old play with?” It could recognize what was in my doodle and make up stories about different toys we showed it.

But most of Astra’s capabilities seemed on-par with what Meta has already made available with its smart glasses. Meta’s multimodal AI can also recognize your surroundings and do a bit of creative writing on your behalf. And while Meta also bills the features as experimental, they are at least broadly available.

The Astra feature that may set Google’s approach apart is the fact that it has a built-in “memory.” After scanning a bunch of objects, it could still “remember” where specific items were placed. For now, it seems Astra’s memory is limited to a relatively short window of time, but members of the research team told us that it could theoretically be expanded. That would obviously open up even more possibilities for the tech, making Astra seem more like an actual assistant. I don’t need to know where I left my glasses 30 seconds ago, but if it could remember where I left them last night, that would actually feel like sci-fi come to life.

But, like so much of generative AI, the most exciting possibilities are the ones that haven’t quite happened yet. Astra might get there eventually, but right now it feels like Google still has a lot of work to do.

Catch up on all the news from Google I/O 2024 right here!

This article originally appeared on Engadget at https://www.engadget.com/google-project-astra-hands-on-full-of-potential-but-its-going-to-be-a-while-235607743.html?src=rss
Categories: Technology

Engadget Podcast: The good, the bad and the AI of Google I/O 2024

Tue, 05/14/2024 - 17:17

We just wrapped up coverage of Google's I/O 2024 keynote, and we're so tired of hearing about AI. In this bonus episode, Cherlynn and Devindra dive into the biggest I/O news: Google's intriguing Project Astra AI assistant; new models for creating video and images; and some improvements to Gemini AI. While some of the announcements seem potentially useful, it's still tough to tell if the move towards AI will actually help consumers, or if Google is just fighting to stay ahead of OpenAI.

Listen below or subscribe on your podcast app of choice. If you've got suggestions or topics you'd like covered on the show, be sure to email us or drop a note in the comments! And be sure to check out our other podcast, Engadget News!

Credits

Hosts: Cherlynn Low and Devindra Hardawar
Music: Dale North

This article originally appeared on Engadget at https://www.engadget.com/engadget-podcast-the-good-the-bad-and-the-ai-of-google-io-2024-221741082.html?src=rss
Categories: Technology

X now treats the term cisgender as a slur

Tue, 05/14/2024 - 16:11

The increasingly discriminatory X (Twitter) now considers the term “cisgender” a slur. Owner Elon Musk posted last June, to the delight of his bigoted brigade of blue-check sycophants, that “‘cis’ or ‘cisgender’ are considered slurs on this platform.” On Tuesday, X made good on the regressive provocateur’s stance and reportedly began posting an official warning that the LGBTQ-inclusive terms could result in a ban from the platform. Not that you’d miss much.

TechCrunch reported on Tuesday that trying to publish a post using the terms “cisgender” or “cis” in the X mobile app will pop up a full-screen warning reading, “This post contains language that may be considered a slur by X and could be used in a harmful manner in violation of our rules.” It then gives you the choice of continuing to publish the post or conforming to the backward views of the worst of us and deleting it.

Of course, neither form of the term cisgender is a slur.

As the historically marginalized transgender community finally began finding at least a sliver of widespread and long overdue social acceptance in the 21st century, the term became more commonly used in the mainstream lexicon to describe people whose gender identity matches the sex they were assigned at birth. Organizations including the American Psychological Association, the World Health Organization, the American Medical Association and the American Psychiatric Association recognize the term.

But some people have a hard time accepting and respecting that some humans are different from others. Those fantasizing (against all evidence and scientific consensus) that the heteronormative ideals they grew up with are absolute gospel sometimes take great offense at being asked to adjust their vocabulary to communicate respect for a community that has spent centuries forced to live in the shadows or risk their safety due to the widespread pathologization of their identities. 

Musk seems to consider those the good ol’ days.

This isn’t the billionaire’s first ride on the Transphobe Train. After his backward tweet last June (on the first day of Pride Month, no less), the edgelord’s platform ran a timeline takeover ad from a right-wing nonprofit, plugging a transphobic propaganda film. In case you’re wondering if the group may have anything of value to say, TechCrunch notes that the same organization also doubts climate change and downplays the dehumanizing atrocities of slavery.

X also reversed course on a policy, implemented long before Musk’s takeover, that banned the deadnaming or misgendering of transgender people.

This article originally appeared on Engadget at https://www.engadget.com/x-now-treats-the-term-cisgender-as-a-slur-211117779.html?src=rss
Categories: Technology

Everything announced at Google I/O 2024 including Gemini AI, Project Astra, Android 15 and more

Tue, 05/14/2024 - 16:04

At the end of I/O, Google’s annual developer conference at the Shoreline Amphitheater in Mountain View, Google CEO Sundar Pichai revealed that the company had said “AI” 121 times. That, essentially, was the crux of Google’s two-hour keynote — stuffing AI into every Google app and service used by more than two billion people around the world. Here are all the major updates that Google announced at the event.

Gemini 1.5 Flash and updates to Gemini 1.5 Pro

Google announced a brand new AI model called Gemini 1.5 Flash, which it says is optimized for speed and efficiency. Flash sits between Gemini 1.5 Pro and Gemini 1.5 Nano, the company’s smallest model that runs locally on device. Google said that it created Flash because developers wanted a lighter and less expensive model than Gemini Pro to build AI-powered apps and services, while keeping some of the features, like a long context window of one million tokens, that differentiate Gemini Pro from competing models. Later this year, Google will double Gemini’s context window to two million tokens, which means that it will be able to process two hours of video, 22 hours of audio, more than 60,000 lines of code or more than 1.4 million words at the same time.

Project Astra

Google showed off Project Astra, an early version of a universal AI-powered assistant that Google DeepMind CEO Demis Hassabis described as Google’s version of an AI agent “that can be helpful in everyday life.”

In a video that Google says was shot in a single take, an Astra user moves around Google’s London office holding up their phone and pointing the camera at various things — a speaker, some code on a whiteboard, and out a window — and has a natural conversation with the app about what it sees. In one of the video’s most impressive moments, the assistant correctly tells the user where she left her glasses before without the user ever having brought up the glasses.

The video ends with a twist — when the user finds and wears the missing glasses, we learn that they have an onboard camera system and are capable of using Project Astra to seamlessly carry on a conversation with the user, perhaps indicating that Google might be working on a competitor to Meta’s Ray-Ban smart glasses.

Ask Google Photos

Google Photos was already intelligent when it came to searching for specific images or videos, but with AI, Google is taking things to the next level. If you’re a Google One subscriber in the US, you will be able to ask Google Photos a complex question like “show me the best photo from each national park I’ve visited” when the feature rolls out over the next few months. Google Photos will use GPS information as well as its own judgment of what is “best” to present you with options. You can also ask Google Photos to generate captions to post the photos to social media.

Veo and Imagen 3

Google’s new AI-powered media creation engines are called Veo and Imagen 3. Veo is Google’s answer to OpenAI’s Sora. It can produce “high-quality” 1080p videos that can last “beyond a minute”, Google said, and can understand cinematic concepts like a timelapse.

Imagen 3, meanwhile, is a text-to-image generator that Google claims handles text better than its previous version, Imagen 2. The result is the company’s “highest quality” text-to-image model with an “incredible level of detail” for “photorealistic, lifelike images” and fewer artifacts — essentially pitting it against OpenAI’s DALL-E 3.

Big updates to Google Search


Google is making big changes to how Search fundamentally works. Most of the updates announced today, like the ability to ask really complex questions (“Find the best yoga or pilates studios in Boston and show details on their intro offers and walking time from Beacon Hill.”) and the ability to use Search to plan meals and vacations, won’t be available unless you opt in to Search Labs, the company’s platform that lets people try out experimental features.

But a big new feature, which Google is calling AI Overviews and which the company has been testing for a year now, is finally rolling out to millions of people in the US. Google Search will now present AI-generated answers on top of the results by default, and the company says that it will bring the feature to more than a billion users around the world by the end of the year.

Gemini on Android


Google is integrating Gemini directly into Android. When Android 15 releases later this year, Gemini will be aware of the app, image or video that you’re running, and you’ll be able to pull it up as an overlay and ask it context-specific questions. Where does that leave Google Assistant, which already does this? Who knows! Google didn’t bring it up at all during today’s keynote.

There were a bunch of other updates too. Google said it would add digital watermarks to AI-generated video and text, make Gemini accessible in the side panel in Gmail and Docs, power a virtual AI teammate in Workspace, listen in on phone calls and detect if you’re being scammed in real time, and a lot more.


Catch up on all the news from Google I/O 2024 right here!

This article originally appeared on Engadget at https://www.engadget.com/everything-announced-at-google-io-2024-including-gemini-ai-project-astra-android-15-and-more-210414580.html?src=rss
Categories: Technology

Animal Well speedrunners are already beating the game in under five minutes

Tue, 05/14/2024 - 14:53

Animal Well is one of the hottest games around. It quickly shot to the top of Steam's top-seller chart after it was released to glowing reviews last Thursday. 

While most players complete the main story in four to six hours, it hasn't taken long for speedrunners to figure out how to blaze through solo developer Billy Basso's eerie labyrinth. YouTubers are already posting runs of under five minutes and the any% record (i.e. the best recorded time without any restrictions) is being smashed over and over. 

Within a couple of hours of Hubert0987 claiming the world record with a 4:44 run on Thursday, The DemonSlayer6669 appeared to snag bragging rights with one that was 18 seconds faster and perhaps the first recorded sub-4:30 time. (Don't watch the video just yet if you haven't beaten the game and would like to avoid spoilers.)

Animal Well hasn't even been out for a week, so you can expect records to keep tumbling as runners optimize routes to the game's final plunger. It's cool to already see a speedrunning community form around a new game as skilled players duke it out, perhaps for the chance to show off their skills at the next big Games Done Quick event.

This article originally appeared on Engadget at https://www.engadget.com/animal-well-speedrunners-are-already-beating-the-game-in-under-five-minutes-195259598.html?src=rss
Categories: Technology

Google expands digital watermarks to AI-made video and text

Tue, 05/14/2024 - 13:55

As Google starts to make its latest video-generation tools available, the company says it has a plan to ensure transparency around the origins of its increasingly realistic AI-generated clips. All video made by the company’s new Veo model in the VideoFX app will have digital watermarks thanks to Google’s SynthID system. Furthermore, SynthID will be able to watermark AI-generated text that comes from Gemini.

SynthID is Google’s digital watermarking system that started rolling out to AI-generated images last year. The tech embeds imperceptible watermarks into AI-made content so that AI detection tools can recognize that the content was generated by AI. Considering that Veo, the company’s latest video generation model previewed onstage at I/O, can create longer and higher-res clips than what was previously possible, tracking the source of such content will be increasingly important.
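Google hasn't published how SynthID embeds its marks, but published text-watermarking research gives a rough sense of the general class of technique: during generation the model slightly favors a pseudo-random "green" subset of tokens, and a detector later checks whether that subset shows up more often than chance. The Python sketch below is a simplified, hypothetical illustration of that idea only, not SynthID's actual algorithm; both functions are invented for the example.

import hashlib
import random

def greenlist(prev_token_id: int, vocab_size: int, fraction: float = 0.5) -> set[int]:
    # Seed a PRNG from the previous token so the "green" subset is
    # reproducible at detection time without access to the model.
    seed = int(hashlib.sha256(str(prev_token_id).encode()).hexdigest(), 16) % (2**32)
    rng = random.Random(seed)
    ids = list(range(vocab_size))
    rng.shuffle(ids)
    return set(ids[: int(vocab_size * fraction)])

def green_fraction(token_ids: list[int], vocab_size: int) -> float:
    # During generation, a small logit bonus nudges sampling toward green tokens.
    # Detection just measures how often each token lands in its predecessor's
    # green list; watermarked text scores well above the ~0.5 chance baseline.
    hits = sum(
        1 for prev, cur in zip(token_ids, token_ids[1:])
        if cur in greenlist(prev, vocab_size)
    )
    return hits / max(len(token_ids) - 1, 1)

Because a mark like this lives in the statistics of many tokens rather than any single word, it can survive light edits, which also explains why heavier rewriting can weaken it, as noted below.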

As generative AI models advance, more companies have turned to watermarking amid fears that AI could fuel a new wave of misinformation. Watermarking systems would give platforms like Google a framework for detecting AI-generated content that may otherwise be impossible to distinguish. TikTok and Meta have also recently announced plans to support similar detection tools on their platforms and label more AI content in their apps.

Of course, there are still significant questions about whether digital watermarks on their own offer sufficient protection against deceptive AI content. Researchers have shown that watermarks can be easy to evade. But making AI-made content detectable in some way is an important first step toward transparency.

Catch up on all the news from Google I/O 2024 right here!

This article originally appeared on Engadget at https://www.engadget.com/google-expands-digital-watermarks-to-ai-made-video-175232320.html?src=rss
Categories: Technology

Gemini will be accessible in the side panel on Google apps like Gmail and Docs

Tue, 05/14/2024 - 13:54

Google is adding Gemini-powered AI automation to more tasks in Workspace. In its Tuesday Google I/O keynote, the company said its advanced Gemini 1.5 Pro will soon be available in the Workspace side panel as “the connective tissue across multiple applications with AI-powered workflows,” as AI grows more intelligent, learns more about you and automates more of your workflow.

Gemini’s job in Workspace is to save you the time and effort of digging through files, emails and other data from multiple apps. “Workspace in the Gemini era will continue to unlock new ways of getting things done,” Google Workspace VP Aparna Pappu said at the event.

The refreshed Workspace side panel, coming first to Gmail, Docs, Sheets, Slides and Drive, will let you chat with Gemini about your content. Its longer context window (essentially, its memory) allows it to organize, understand and contextualize your data from different apps without leaving the one you’re in. This includes things like comparing receipt attachments, summarizing (and answering back-and-forth questions about) long email threads, or highlighting key points from meeting recordings.


Another example Google provided was planning a family reunion when your grandmother asks for hotel information. With the Workspace side panel, you can ask Gemini to find the Google Doc with the booking information by using the prompt, “What is the hotel name and sales manager email listed in @Family Reunion 2024?” Google says it will find the document and give you a quick answer, allowing you to insert it into your reply as you save time by faking human authenticity for poor Grandma.

The email-based changes are coming to the Gmail mobile app, too. “Gemini will soon be able to analyze email threads and provide a summarized view with the key highlights directly in the Gmail app, just as you can in the side panel,” the company said.

Summarizing in the Gmail app is coming to Workspace Labs this month. Meanwhile, the upgraded Workspace side panel will arrive starting Tuesday for Workspace Labs and Gemini for Workspace Alpha users. Google says all the features will arrive for the rest of Workspace customers and Google One AI Premium users next month.

Catch up on all the news from Google I/O 2024 right here!

This article originally appeared on Engadget at https://www.engadget.com/gemini-will-be-accessible-in-the-side-panel-on-google-apps-like-gmail-and-docs-185406695.html?src=rss
Categories: Technology

Google Gemini can power a virtual AI teammate with its own Workspace account

Tue, 05/14/2024 - 13:28

Google's Gemini AI systems can do a lot, judging by today's I/O keynote. That includes the option to set up a virtual teammate with its own Workspace account. You can configure the teammate to carry out specific tasks, such as monitoring and tracking projects, organizing information, providing context, pinpointing trends after analyzing data and playing a role in team collaboration.

In Google Chat, the teammate can join all relevant rooms and you can ask it questions based on all the conversation histories, Gmail threads and anything else it has access to. It can tell team members whether their projects are approved or if there might be an issue based on conflicting messages. 

It seems like the virtual teammate is just a tech demo for now, however. Aparna Pappu, vice president and GM of Workspace, said Google has "a lot of work to do to figure out how to bring these agentive experiences, like virtual teammates, into Workspace." That includes finding ways to let third parties make their own versions.

While it doesn't seem like this virtual teammate will be available soon, it could eventually prove to be a serious timesaver — as long as you trust it to get everything right the first time around.

Catch up on all the news from Google I/O 2024 right here!

This article originally appeared on Engadget at https://www.engadget.com/google-gemini-can-power-a-virtual-ai-teammate-with-its-own-workspace-account-182809274.html?src=rss
Categories: Technology

Google announces new scam detection tools that provide real-time alerts during phone calls

Tue, 05/14/2024 - 13:14

Google just announced new scam detection tools coming to Android phones later this year, which is a good thing as scammers keep getting better and better at parting people from their money. The toolset, revealed at Google I/O 2024, is still in the testing stages but uses AI to suss out fraudsters in the middle of a conversation.

You read that right. The AI will be constantly on the hunt for conversation patterns commonly associated with scams. Once detected, you’ll receive a real-time alert on the phone, putting to bed any worries that the person on the other end is actually heading over to deliver a court summons or whatever.

Google gives the example of a “bank representative” asking for personal information, like PINs and passwords. Real banks are unlikely to ask for those, so the AI would flag the request and issue an alert. Everything happens on the device, so it stays private. This feature isn’t coming to Android 15 right away, and the company says it’ll share more details later in the year. We do know that people will have to opt in to use the tool.
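As a toy illustration of the kind of on-device check being described (and definitely not Google's implementation), a detector could scan chunks of the live-call transcript for requests a legitimate caller wouldn't make. The pattern list and function below are invented for the example; a real system would rely on an on-device language model rather than keywords.

import re

# Hypothetical pattern list; a real detector would use an on-device model, not keywords.
SCAM_PATTERNS = [
    r"\b(pin|password|one[- ]time code)\b",
    r"\bgift cards?\b",
    r"\bwire (the )?money\b",
]

def flag_transcript_chunk(chunk: str) -> bool:
    """Return True if a chunk of live-call transcript matches a known scam pattern."""
    text = chunk.lower()
    return any(re.search(pattern, text) for pattern in SCAM_PATTERNS)

# Would trigger the kind of real-time alert described above.
print(flag_transcript_chunk("Please read me your PIN to verify your account"))  # True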

Google made a big move with Android 15, bringing its Gemini chatbot to actual devices instead of requiring a connection to the cloud. In addition to this scam detection tech, the addition of onboard AI will allow for many more features, like contextual awareness when using apps.

Catch up on all the news from Google I/O 2024 right here!

This article originally appeared on Engadget at https://www.engadget.com/google-announces-new-scam-detection-tools-that-provide-real-time-alerts-during-phone-calls-181442091.html?src=rss
Categories: Technology

With Gemini Live, Google wants you to relax and have a natural chat with AI

Tue, 05/14/2024 - 13:13

While Google and OpenAI have been racing to win the AI crown over the past year, we've seemingly drifted away from the idea of speaking to virtual assistants. Generative AI products have typically launched with text-only inputs, only later adding the ability to search images and handle basic voice commands. At Google I/O today, the company showed off Gemini Live, a new mobile experience for natural conversations with its AI.

Google offered up a few potential use cases: You could have a conversation with Gemini Live to help prepare for a job interview, where it could potentially ask you relevant questions about the position. It could also give you public speaking tips if you want to rehearse a speech. What makes Gemini Live unique is that you'll be able to speak at your own pace, or even interrupt its responses if you'd like. Ideally, it should be more like having a conversation with a person, instead of just voicing smart assistant commands or generative AI queries.

At I/O, Google also showed off Project Astra, a next-generation virtual assistant that takes the concept of Gemini Live even further. Astra is able to view your camera feed and answer questions in real-time. It's unclear how long that'll take to arrive, but Google says some of Astra's live video features will come to Gemini Live later this year. Gemini Live will be available for Gemini Advanced subscribers in the next few months.

Catch up on all the news from Google I/O 2024 right here!

This article originally appeared on Engadget at https://www.engadget.com/with-gemini-live-google-wants-you-to-relax-and-have-a-natural-chat-with-ai-181329788.html?src=rss
Categories: Technology

Google's Gemini Nano brings better image-description smarts to its TalkBack vision tool

Tue, 05/14/2024 - 13:07

The Google I/O event is here, and the company is announcing lots of great updates for your Android device. As we heard earlier, Gemini Nano is getting multimodal support, meaning your Android will still process text but with a better understanding of other factors like sights, sounds and spoken language. Now Google has shared that the new tool is also coming to its TalkBack feature.

TalkBack is an existing tool that reads aloud a description of an image, whether it's one you captured or from the internet. Gemini Nano's multimodal support should provide a more detailed understanding of the image. According to Google, TalkBack users encounter about 90 images each day that don't have a label. Gemini Nano should be able to provide missing information, such as what an item of clothing looks like or the details of a new photo sent by a friend. 

Gemini Nano works directly on a person's device, meaning it should still function properly without any network connection. While we don't yet have an exact date for when it will arrive, Google says TalkBack will get Gemini Nano's updated features later this year.

Catch up on all the news from Google I/O 2024 right here!

This article originally appeared on Engadget at https://www.engadget.com/googles-gemini-nano-brings-better-image-description-smarts-to-its-talkback-vision-tool-180759598.html?src=rss
Categories: Technology

Google builds Gemini right into Android, adding contextual awareness within apps

Tue, 05/14/2024 - 13:04

Google just announced some nifty improvements to its Gemini AI chatbot for Android devices as part of the company’s I/O 2024 event. The AI is now part of the Android operating system, allowing it to integrate in a more comprehensive way.

The coolest new feature wouldn’t be possible without that integration with the underlying OS. Gemini is now much better at understanding context as you control apps on the smartphone. What does this mean exactly? Once the tool officially launches as part of Android 15, you’ll be able to bring up a Gemini overlay that rests on top of the app you’re using. This will allow for context-specific actions and queries.

Google gives the example of quickly dropping generated images into Gmail and Google Messages, though you may want to steer clear of historical images for now. The company also teased a feature called “Ask This Video” that lets users pose questions about a particular YouTube video, which the chatbot should be able to answer. Google says this should work with "billions" of videos. There's a similar tool coming for PDFs. 


It’s easy to see where this tech is going. Once Gemini has access to the lion’s share of your app library, it should be able to actually deliver on some of those lofty promises made by rival AI companies like Humane and Rabbit. Google says it's “just getting started with how on-device AI can change what your phone can do” so we imagine future integration with apps like Uber and Doordash, at the very least.

Circle to Search is also getting a boost thanks to on-board AI. Users will be able to circle just about anything on their phone and receive relevant information. Google says people will be able to do this without having to switch apps. This even extends to math and physics problems: just circle one for the answer, which is likely to please students and frustrate teachers.

Catch up on all the news from Google I/O 2024 right here!

This article originally appeared on Engadget at https://www.engadget.com/google-builds-gemini-right-into-android-adding-contextual-awareness-within-apps-180413356.html?src=rss
Categories: Technology

Android's Circle to Search can now help students solve math and physics homework

Tue, 05/14/2024 - 13:02

Google has introduced another capability for its Circle to Search feature at the company's annual I/O developer conference, and it's something that could help students better understand potentially difficult class topics. The feature will now be able to show them step-by-step instructions for a "range of physics and math word problems." They just have to activate the feature by long-pressing the home button or navigation bar and then circling the problem that's got them stumped, though some math problems will require users to be signed up for Google's experimental Search Labs feature.

The company says Circle to Search's new capability was made possible by its new family of AI models called LearnLM that was specifically created and fine-tuned for learning. It's also planning to make adjustments to this particular capability and to roll out an upgraded version later this year that could solve even more complex problems "involving symbolic formulas, diagrams, graphs and more." Google launched Circle to Search earlier this year at a Samsung Unpacked event, where the feature debuted on Galaxy S24 and Pixel 8 devices. It's now also out for the Galaxy S23, Galaxy S22, Z Fold, Z Flip, Pixel 6 and Pixel 7 devices, and it'll likely make its way to more hardware in the future.

In addition to the new Circle to Search capability, Google has also revealed that devices that can support the Gemini for Android chatbot assistant will now be able to bring it up as an overlay on top of the application that's currently open. Users can then drag and drop images straight from the overlay into apps like Gmail, for instance, or use the overlay to look up information without having to swipe away from whatever they're doing. They can tap "Ask this video" to find specific information within a YouTube video that's open, and if they have access to Gemini Advanced, they can use the "Ask this PDF" option to find information from within lengthy documents. 

Google is also rolling out multimodal capabilities to Nano, the smallest model in the Gemini family that can process information on-device. The updated Gemini Nano, which will be able to process sights, sounds and spoken language, is coming to Google's TalkBack screen reader later this year. Gemini Nano will enable TalkBack to describe images onscreen more quickly and even without an internet connection. Finally, Google is currently testing a Gemini Nano feature that can alert users while a call is ongoing if it detects common conversation patterns associated with scams. Users will be alerted, for instance, if they're talking to someone asking them for their PINs or passwords or to someone asking them to buy gift cards. 

Catch up on all the news from Google I/O 2024 right here!

This article originally appeared on Engadget at https://www.engadget.com/androids-circle-to-search-can-now-help-students-solve-math-and-physics-homework-180223229.html?src=rss
Categories: Technology

Google's Gemini will search your videos to help you solve problems

Tue, 05/14/2024 - 12:52

As part of its push toward adding generative AI to search, Google has introduced a new twist: video. Gemini will let you upload video that demonstrates an issue you're trying to resolve, then scour user forums and other areas of the internet to find a solution. 

As an example, Google's Rose Yao talked onstage at I/O 2024 about a used turntable she bought and how she couldn't get the needle to sit on the record. Yao uploaded a video showing the issue, then Gemini quickly found an explainer describing how to balance the arm on that particular make and model. 


"Search is so much more than just words in a text box. Often the questions you have are about the things you see around you, including objects in motion," Google wrote. "Searching with video saves you the time and trouble of finding the right words to describe this issue, and you’ll get an AI Overview with steps and resources to troubleshoot."

If the video alone doesn't make it clear what you're trying to figure out, you can add text or draw arrows that point to the issue in question. 

OpenAI just introduced GPT-4o with the ability to interpret live video in real time, then describe a scene or even sing a song about it. Google, however, is taking a different tack with video by focusing on its Search product for now. Searching with video is coming to Search Labs users in the US in English to start with, but will expand to more regions over time, the company said.

Catch up on all the news from Google I/O 2024 right here!

This article originally appeared on Engadget at https://www.engadget.com/googles-gemini-will-search-your-videos-to-help-you-solve-problems-175235105.html?src=rss
Categories: Technology

Google Search will now show AI-generated answers to millions by default

Tue, 05/14/2024 - 12:45

Google is shaking up Search. On Tuesday, the company announced big new AI-powered changes to the world’s dominant search engine at I/O, Google’s annual conference for developers. With the new features, Google is positioning Search as more than a way to simply find websites. Instead, the company wants people to use its search engine to directly get answers and help them with planning events and brainstorming ideas.

“[With] generative AI, Search can do more than you ever imagined,” wrote Liz Reid, vice president and head of Google Search, in a blog post. “So you can ask whatever’s on your mind or whatever you need to get done — from researching to planning to brainstorming — and Google will take care of the legwork.”

Google’s changes to Search, the primary way that the company makes money, are a response to the explosion of generative AI since OpenAI released ChatGPT at the end of 2022. Since then, a handful of AI-powered apps and services including ChatGPT, Anthropic's Claude, Perplexity, and Microsoft’s Bing, which is powered by OpenAI’s GPT-4, have challenged Google’s flagship service by directly providing answers to questions instead of simply presenting people a list of links. This is the gap that Google is racing to bridge with its new features in Search.

Starting today, Google will show complete AI-generated answers in response to most search queries at the top of the results page in the US. Google first unveiled the feature a year ago at Google I/O in 2023, but so far, anyone who wanted to use the feature had to sign up for it as part of the company’s Search Labs platform that lets people try out upcoming features ahead of their general release. Google is now making AI Overviews available to hundreds of millions of Americans, and says that it expects it to be available in more countries to over a billion people by the end of the year. Reid wrote that people who opted to try the feature through Search Labs have used it “billions of times” so far, and said that any links included as part of the AI-generated answers get more clicks than if the page had appeared as a traditional web listing, something that publishers have been concerned about. “As we expand this experience, we’ll continue to focus on sending valuable traffic to publishers and creators,” Reid wrote. 

In addition to AI Overviews, searching for certain queries around dining and recipes (and later movies, music, books, hotels, shopping and more) in English in the US will show a new search page where results are organized using AI. “[When] you’re looking for ideas, Search will use generative AI to brainstorm with you and create an AI-organized results page that makes it easy to explore,” Reid said in the blog post.


If you opt in to Search Labs, you’ll be able to access even more features powered by generative AI in Google Search. You’ll be able to get AI Overviews to simplify the language of an answer or break down a complex topic in more detail, such as a query asking Google to explain the connection between lightning and thunder.


Search Labs testers will also be able to ask Google really complex questions in a single query to get answers on a single page instead of having to do multiple searches. The example that Google’s blog post gives: “Find the best yoga or pilates studios in Boston and show details on their intro offers and walking time from Beacon Hill.” In response, Google shows the highest-rated yoga and pilates studios near Boston’s Beacon Hill neighborhood and even puts them on a map for easy navigation.


Google also wants to become a meal and vacation planner by letting people who sign up for Search Labs ask queries like “create a 3 day meal plan for a group that’s easy to prepare” and letting you swap out individual results in its AI-generated plan with something else (swapping a meat-based dish in a meal plan for a vegetarian one, for instance).


Finally, Google will eventually let anyone who signs up for Search Labs use a video as a search query instead of text or images. “Maybe you bought a record player at a thriftshop, but it’s not working when you turn it on and the metal piece with the needle is drifting unexpectedly,” wrote Reid in Google’s blog post. “Searching with video saves you the time and trouble of finding the right words to describe this issue, and you’ll get an AI Overview with steps and resources to troubleshoot.”

Google said that all these new capabilities are powered by a brand new Gemini model customized for Search that combines Gemini’s advanced multi-step reasoning and multimodal abilities with Google’s traditional search systems.

Catch up on all the news from Google I/O 2024 right here!

This article originally appeared on Engadget at https://www.engadget.com/google-search-will-now-show-ai-generated-answers-to-millions-by-default-174512845.html?src=rss
Categories: Technology

Google unveils Veo and Imagen 3, its latest AI media creation models

Tue, 05/14/2024 - 12:36

It's all AI all the time at Google I/O! Today, Google announced its new AI media creation engines: Veo, which can produce "high-quality" 1080p videos; and Imagen 3, its latest text-to-image framework. Neither sounds particularly revolutionary, but they're a way for Google to keep up the fight against OpenAI's Sora video model and Dall-E 3, a tool that has practically become synonymous with AI-generated images.

Google claims Veo has "an advanced understanding of natural language and visual semantics" to create whatever video you have in mind. The AI generated videos can last "beyond a minute." Veo is also capable of understanding cinematic and visual techniques, like the concept of a timelapse. But really, that should be table stakes for an AI video generation model, right?

To prove that Veo isn't out to steal artists' jobs, Google has also partnered with Donald Glover and Gilga, his creative studio, to show off the model's capabilities. In a very brief promotional video, we see Glover and crew using text to create video of a convertible arriving at a European home, and a sailboat gliding through the ocean. According to Google, Veo can simulate real-world physics better than its previous models, and it's also improved how it renders high-definition footage.

"Everybody's going to become a director, and everybody should be a director," Glover says in the video, absolutely earning his Google paycheck. "At the heart of all of this is just storytelling. The closer we are to be able to tell each other our stories, the more we'll understand each other."

It remains to be seen if anyone will actually want to watch AI generated video, outside of the morbid curiosity of seeing a machine attempt to algorithmically recreate the work of human artists. But that's not stopping Google or OpenAI from promoting these tools and hoping they'll be useful (or at least, make a bunch of money). Veo will be available inside of Google's VideoFX tool today for some creators, and the company says it'll also be coming to YouTube Shorts and other products. If Veo does end up becoming a built-in part of YouTube Shorts, that's at least one feature Google can lord over TikTok.


As for Imagen 3, Google is making the usual promises: It's said to be the company's "highest quality" text-to-image model, with "incredible level of detail" for "photorealistic, lifelike images" and fewer artifacts. The real test, of course, will be to see how it handles prompts compared to Dall-E 3. Imagen 3 handles text better than before, Google says, and it's also smarter about handling details from long prompts.

Google is also working with recording artists like Wyclef Jean and Bjorn to test out its Music AI Sandbox, a set of tools that can help with song and beat creation. We only saw a brief glimpse of this, but it's led to a few intriguing demos.

The sun rises and sets. We're all slowly dying. And AI is getting smarter by the day. That seems to be the big takeaway from Google's latest media creation tools. Of course they're getting better! Google is pouring billions into making the dream of AI a reality, all in a bid to own the next great leap for computing. Will any of this actually make our lives better? Will they ever be able to generate art with genuine soul? Check back at Google I/O every year until AGI actually appears, or our civilization collapses.

Catch up on all the news from Google I/O 2024 right here!

This article originally appeared on Engadget at https://www.engadget.com/google-unveils-veo-and-imagen-3-its-latest-ai-media-creation-models-173617373.html?src=rss
Categories: Technology

Google just snuck a pair of AR glasses into a Project Astra demo at I/O

Tue, 05/14/2024 - 12:28

In a video showcasing the prowess of Google's new Project Astra experience at I/O 2024, an unnamed person demonstrating the feature asked Gemini "do you remember where you saw my glasses?" The AI impressively responded "Yes, I do. Your glasses were on a desk near a red apple," despite said object not actually being in view when the question was asked. But these glasses weren't your bog-standard assistive vision aid; these had a camera onboard and some sort of visual interface!

The tester picked up their glasses and put them on, and proceeded to ask the AI more questions about things they were looking at. Clearly, there is a camera on the device that's helping it take in the surroundings, and we were shown some sort of interface where a waveform moved to indicate it was listening. Onscreen captions appeared to reflect the answer that was being read aloud to the wearer, as well. So if we're keeping track, that's at least a microphone and speaker onboard too, along with some kind of processor and battery to power the whole thing. 

We only caught a brief glimpse of the wearable, but from the sneaky seconds it was in view, a few things were evident. The glasses had a simple black frame and didn't look at all like Google Glass. They didn't appear very bulky, either. 

In all likelihood, Google is not ready to actually launch a pair of glasses at I/O. It breezed right past the wearable's appearance and barely mentioned it, saying only that Project Astra and the company's vision of "universal agents" could come to devices like our phones or glasses. We don't know much else at the moment, but if you've been mourning Google Glass or the company's other failed wearable products, this might instill some hope yet.

Catch up on all the news from Google I/O 2024 right here!

This article originally appeared on Engadget at https://www.engadget.com/google-just-snuck-a-pair-of-ar-glasses-into-a-project-astra-demo-at-io-172824539.html?src=rss
Categories: Technology

Google's Project Astra uses your phone's camera and AI to find noise makers, misplaced items and more.

Tue, 05/14/2024 - 12:28

When Google first showcased its Duplex voice assistant technology at its developer conference in 2018, it was both impressive and concerning. Today, at I/O 2024, the company may be bringing up those same reactions again, this time by showing off another application of its AI smarts with something called Project Astra. 

The company couldn't even wait till its keynote today to tease Project Astra, posting a video of a camera-based AI app to its social media yesterday. At its keynote today, though, Google DeepMind CEO Demis Hassabis shared that his team has "always wanted to develop universal AI agents that can be helpful in everyday life." Project Astra is the result of progress on that front.

What is Project Astra?

According to a video that Google showed during a media briefing yesterday, Project Astra appeared to be an app which has a viewfinder as its main interface. A person holding up a phone pointed its camera at various parts of an office and verbally said "Tell me when you see something that makes sound." When a speaker next to a monitor came into view, Gemini responded "I see a speaker, which makes sound."

The person behind the phone stopped and drew an onscreen arrow to the top circle on the speaker and said, "What is that part of the speaker called?" Gemini promptly responded "That is the tweeter. It produces high-frequency sounds."

Then, in the video that Google said was recorded in a single take, the tester moved over to a cup of crayons further down the table and asked "Give me a creative alliteration about these," to which Gemini said "Creative crayons color cheerfully. They certainly craft colorful creations."

Wait, were those Project Astra glasses? Is Google Glass back?

The rest of the video goes on to show Gemini in Project Astra identifying and explaining parts of code on a monitor and telling the user what neighborhood they were in based on the view out the window. Most impressively, Astra was able to answer "Do you remember where you saw my glasses?" even though said glasses were completely out of frame and were not previously pointed out. "Yes, I do," Gemini said, adding "Your glasses were on a desk near a red apple."

After Astra located those glasses, the tester put them on and the video shifted to the perspective of what you'd see on the wearable. Using a camera onboard, the glasses scanned the wearer's surroundings to see things like a diagram on a whiteboard. The person in the video then asked "What can I add here to make this system faster?" As they spoke, an onscreen waveform moved to indicate it was listening, and as it responded, text captions appeared in tandem. Astra said "Adding a cache between the server and database could improve speed."

The tester then looked over to a pair of cats doodled on the board and asked "What does this remind you of?" Astra said "Schrodinger's cat." Finally, they picked up a plush tiger toy, put it next to a cute golden retriever and asked for "a band name for this duo." Astra dutifully replied "Golden stripes."

How does Project Astra work?

This means that not only was Astra processing visual data in real time, it was also remembering what it saw and working with an impressive backlog of stored information. This was achieved, according to Hassabis, because these "agents" were "designed to process information faster by continuously encoding video frames, combining the video and speech input into a timeline of events, and caching this information for efficient recall."
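Translated into code, Hassabis's description suggests a loop that encodes each frame and speech segment as it arrives, appends it to a time-ordered cache, and searches that cache when the user asks about something seen earlier. The Python sketch below is a hypothetical illustration of that idea only; the class, its methods and the string-matching recall are invented here and are not Google's implementation.

from collections import deque
from dataclasses import dataclass
import time

@dataclass
class Event:
    timestamp: float
    kind: str          # "frame" or "speech"
    embedding: list    # placeholder for an encoded representation
    description: str   # e.g. "glasses on a desk near a red apple"

class TimelineCache:
    """Hypothetical sketch: keep encoded video frames and speech in one
    bounded, time-ordered cache so earlier observations can be recalled."""

    def __init__(self, max_events: int = 1000):
        self.events = deque(maxlen=max_events)  # old events fall out automatically

    def add(self, kind: str, embedding: list, description: str) -> None:
        self.events.append(Event(time.time(), kind, embedding, description))

    def recall(self, query: str) -> Event | None:
        # Naive recall: newest event whose description shares a word with the
        # query. A real system would compare embeddings, not strings.
        words = set(query.lower().split())
        for event in reversed(self.events):
            if words & set(event.description.lower().split()):
                return event
        return None

In the demo's terms, a frame logged minutes earlier as "glasses on a desk near a red apple" is what would let recall("where did you see my glasses") produce an answer even though the glasses are no longer in view.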

It's also worth noting that, at least in the video, Astra was responding quickly. Hassabis noted in a blog post that "While we’ve made incredible progress developing AI systems that can understand multimodal information, getting response time down to something conversational is a difficult engineering challenge."

Google has also been working on giving its AI a greater range of vocal expression, using its speech models to enhance "how they sound, giving the agents a wider range of intonations." This sort of mimicry of human expressiveness in responses is reminiscent of Duplex's pauses and utterances that led people to think Google's AI might be a candidate for the Turing test.

When will Project Astra be available?

While Astra remains an early feature with no discernible plans for launch, Hassabis wrote that in the future, these assistants could be available "through your phone or glasses." No word yet on whether those glasses are actually a product or the successor to Google Glass, but Hassabis did write that "some of these capabilities are coming to Google products, like the Gemini app, later this year."

Catch up on all the news from Google I/O 2024 right here!

This article originally appeared on Engadget at https://www.engadget.com/googles-project-astra-uses-your-phones-camera-and-ai-to-find-noise-makers-misplaced-items-and-more-172642329.html?src=rss
Categories: Technology

Google's new Gemini 1.5 Flash AI model is lighter than Gemini Pro and more accessible

Tue, 05/14/2024 - 12:23

Google announced updates to its Gemini family of AI models at I/O, the company’s annual conference for developers, on Tuesday. It’s rolling out a new model called Gemini 1.5 Flash, which it says is optimized for speed and efficiency.

“[Gemini] 1.5 Flash excels at summarization, chat applications, image and video captioning, data extraction from long documents and tables, and more,” wrote Demis Hassabis, CEO of Google DeepMind, in a blog post. Hassabis added that Google created Gemini 1.5 Flash because developers needed a model that was lighter and less expensive than the Pro version, which Google announced in February. Gemini 1.5 Pro is more efficient and powerful than the company’s original Gemini model announced late last year.

Gemini 1.5 Flash sits between Gemini 1.5 Pro and Gemini 1.5 Nano, Google’s smallest model that runs locally on devices. Despite being lighter weight than Gemini Pro, however, it is just as powerful. Google said that this was achieved through a process called “distillation,” where the most essential knowledge and skills from Gemini 1.5 Pro were transferred to the smaller model. This means that Gemini 1.5 Flash will get the same multimodal capabilities as Pro, as well as its long context window – the amount of data that an AI model can ingest at once – of one million tokens. This, according to Google, means that Gemini 1.5 Flash will be capable of analyzing a 1,500-page document or a codebase with more than 30,000 lines at once.
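Google hasn't detailed its distillation recipe, but in general knowledge distillation trains a smaller "student" model to match a larger "teacher" model's output distribution rather than just the raw training labels. Below is a minimal, generic sketch of that loss in PyTorch; it illustrates the textbook technique, not Google's actual process.

import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature: float = 2.0):
    # The student learns to match the teacher's softened token distribution.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitude stays comparable across temperatures.
    return F.kl_div(log_student, soft_teacher, reduction="batchmean") * temperature ** 2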

Gemini 1.5 Flash (like the rest of these models) isn’t really meant for consumers. Instead, it’s a faster and less expensive way for developers building their own AI products and services using tech designed by Google.
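For those developers, access at launch was through Google AI Studio's API and Vertex AI. As a rough sketch, and assuming the google-generativeai Python package and the "gemini-1.5-flash" model identifier Google announced (both subject to change), a call that leans on the long context window might look something like this:

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # key generated in Google AI Studio

# Model name taken from Google's announcement; check current docs before relying on it.
model = genai.GenerativeModel("gemini-1.5-flash")

# The one-million-token window means an entire report can ride along in one prompt.
with open("quarterly_report.txt") as f:
    document = f.read()

response = model.generate_content(
    ["Summarize the key findings in five bullet points.", document]
)
print(response.text)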

In addition to launching Gemini 1.5 Flash, Google is also upgrading Gemini 1.5 Pro. The company said that it had “enhanced” the model’s abilities to write code, reason and parse audio and images. But the biggest update is yet to come – Google announced it will double the model’s existing context window to two million tokens later this year. That would make it capable of processing two hours of video, 22 hours of audio, more than 60,000 lines of code or more than 1.4 million words at the same time.

Both Gemini 1.5 Flash and Pro are now available in public preview in Google’s AI Studio and Vertex AI. The company also announced today a new version of its Gemma open model, called Gemma 2. But unless you’re a developer or someone who likes to tinker around with building AI apps and services, these updates aren’t really meant for the average consumer.

Catch up on all the news from Google I/O 2024 right here!

This article originally appeared on Engadget at https://www.engadget.com/googles-new-gemini-15-flash-ai-model-is-lighter-than-gemini-pro-and-more-accessible-172353657.html?src=rss
Categories: Technology
