Apple

There are a couple of community rules in addition to the main instance rules.

All posts must be about Apple

Anything goes as long as it’s about Apple. News about other companies and devices is allowed if it directly relates to Apple.

No NSFW content

While lemmy.zip allows NSFW content, this community is intended to be a place for all to feel welcome. Any NSFW content will be removed and the user banned.

If you have any comments or suggestions, please message one of the moderators.

founded 2 years ago

MODERATORS

Betawhat@lemmy.zip

Apple's new Siri will secretly use Google Gemini models behind the scenes (9to5mac.com)

submitted 1 day ago by nemeski@mander.xyz to c/apple@lemmy.zip

[–] spicehoarder@lemmy.zip 9 points 20 hours ago (3 children)

Weird because Gemini scores the worst on my personal accuracy testing.

[–] luipaard0011@lemmy.zip 1 points 5 hours ago

Same as Siri then

[–] AlecSadler@lemmy.blahaj.zone 1 points 15 hours ago (1 children)

Same. Right now I rank it Claude, everything else except NaziAI, ChatGPT & Gemini, NaziAI.

[–] spicehoarder@lemmy.zip 3 points 7 hours ago (1 children)

I've found that IBM's Granite4 3b works the best for home assistant. Claude is my #1 for code generation.

[–] AlecSadler@lemmy.blahaj.zone 1 points 6 hours ago

Ooo, I haven't tried Granite4, I'll check it out.

[–] cornshark@lemmy.world 1 points 16 hours ago (1 children)

Do you suspect your personal accuracy testing is a lot better than the global lmsys arena leaderboard?

[–] spicehoarder@lemmy.zip 1 points 7 hours ago

My tests are very basic, but cover a lot of things relevant to my personal life. Gemini failed to correctly respond to a single one of my tests.

Gemini struggled the most with false information, both creating falsehoods and accepting them without question. The most difficult thing for most LLMs was responding "I don't know" or "that's not right". It is my firm belief that any lack of knowledge should be identified and handled by RAG.