Have you tried using different AI models within perplexity? Any ideas about which one is best? I don't know whether to expect better results from Sonnet 3.5 (within perplexity) or one of the models that perplexity have finetuned themselves, like Sonar Huge.
These results are all from the vanilla UI. Comparing individual models on harder tasks is on my maybe list, but the project was rapidly suffering from scope creep so I rushed it out the door.
FWIW I have used Perplexity twice since you mentioned it, it was somewhat helpful both times, but also, both times the citations had errors. By that I mean it would say something and then put a citation number next to it, but what it said was not in the cited document.
I got my first hallucination shortly after posting this, so it's definitely not perfect. But I still find the ease of checking a big improvement over other models.
but I haven’t ended up finding Pro to be much of an improvement
Oh. Then I'm surprised by this position. I find perplexity basic to be so superficial that I'd usually prefer to start with Claude even knowing it can't cite anything and frequently makes errors.
All perplexity basic seems to do is google search and then summarize the contents of the results, which sometimes reduces the friction of researching something, but we know that google searches are not very thorough and often miss a lot of important stuff. I was hoping pro had more going on.
I assume you're using claude pro? Because I found the top free version unusable.
Could you post some questions you've run on both and their answers?
Pro and free are currently using the same model.
Sometimes I use it for finding examples of things. Perplexity is actually not good at finding things.
EG:
What are some single player games that kept some logic on company servers so that players couldn't figure out secrets by decompiling the game code?
Perplexity: [looks at the pages you'd get if you just ran that as a google search] While there are no specific examples in the search results, I'm going to say some shit about the idea of doing that which very few people asking this question would need.
Claude: Spore, Diablo III, SimCity, No Man's Sky, Assassin's Creed Origins, Hitman [actually understands and engages with the question]
Oh, I just double-checked the claim about No Man's Sky (and Spore), and it almost certainly isn't true o_o
Though the reason it gave, "preserving a sense of mystery of exploration" would have been a really good application for this, and I am kind of surprised they didn't do it. Which at least partially satisfied my query. So still a somewhat useful example.
what are the rates of the most common intellectual disabilities in childhood
perplexity: [boring stuff, doesn't list the disabilities]
claude: [lists some disabilities and] Fetal Alcohol Spectrum Disorders (FASD): Estimated 2-5% of school-age children in the US [o}o !!!]
And I was able to corroborate that claim and this has substantially impacted my worldview. It's the most common cause of childhood intellectual disability. I then looked up the FASD subreddit and had a real heartwrenching time.
Another weird example
Is natto considered mogumogu
perplexity: yes
claude: no, mogumogu means chewy [ongoing conversation] oh you're thinking of neba-neba
And I can't imagine having a conversation with Perplexity in this way, though I'm not sure why it's so bad at that. They seem to have made it so that it forgets all of the context in followup questions.
I often feel like Perplexity's LLM parts, the clever parts, the synthesis, are flattened away; all it's allowed to do is recite.
Huh, yeah, does seem like Claude was the winner there. I reproed the intellectual disability answers and got the same results you did. I was able to get a better answer from Perplexity with a slight rephrase, but I hate having to play the rephrase game with AIs, so that's a modest mitigator. And the answer was not internally consistent.
I think the difference between us might be that I do primarily want a search engine, and perhaps my natural phrasing works better with Perplexity.
Have you tried exa.ai? Maybe that's the crux: it's doing semantic search and Perplexity doesn't seem to be, so exa maybe takes over its niche, and also makes me kinda mad at Perplexity for not doing the most transformative thing these engines could be doing.
Interesting that perplexity also doesn't put FASD at the top despite it being so common.
Perplexity seems to be significantly more effective than competing models when it comes to acting as a research device/answer engine. This is mainly because that is its main use case, whereas other models such as Claude by itself and ChatGPT excel in other areas. I do believe that Perplexity's citation techniques could be some of the first baby steps toward far (possibly near) future automated AI research.
Perplexity is the first generalized AI chatbot I’ve found useful enough to integrate into any part of my daily workflow, much less across multiple domains. It speeds me up enough that I’m planning an increase in my freelancing rate.
Perplexity has three key advantages:
The citations are a bigger deal than they sound at first. I could of course google anything ChatGPT tells me and check the results, but that’s costly. I have to spin up a new mental thread (and browser tab), sort through Google’s increasingly shitty results, remember what claim I’m trying to check… the process eats up a lot of working memory at a time when it is scarce. If I wanted to trawl a sea of content farms I would have gone to Google in the first place.
Perplexity’s high-quality inline citations solve that. Want to check if a claim is justified? Click the number next to it and look for the relevant quote. Not a perfect substitute for a 30 hour literature search, but a lot of power per minute. I don’t have to spin up a new mental thread, so the whole thing feels much lower friction. Also thus far I’ve never caught it in a hallucination on a question of fact, although if real answers are scarce it will sometimes do the thing where it draws increasingly tenuous connections instead of just telling me no.
I also found Perplexity easier to talk to while trying to learn some math and math history. It felt about as useful as my human tutor: better at some things, worse at others.
Test Questions
The real test was which AI do I find myself using, and the answer to that is definitively Perplexity. But just for fun or in case I was wrong, I ran a few questions through all major engines to see what they said.
In all cases I used the top free model available. I have Perplexity Pro thanks to a promo, but I haven’t ended up finding Pro to be much of an improvement. I do expect to use the pro-only PDF-analysis tool in the near future.
Comments from me are in italics and [brackets]
How do fractals relate to chaos theory?
I have a project investigating chaos theory as a case study in the development of scientific/mathematical paradigms. This doesn’t require I become a mathematician, but I do need to understand how discoveries relate to each other and generally not be an idiot on the topic. Perplexity is my always available, infinitely patient tutor.
This should have been a challenging role for an AI to fill, because I have a friend who loves explaining math more than almost anything, and did his PhD in a related area. I even feel comfortable looking stupid in front of him. And yet Perplexity still had a value add, because it was never offline and I didn’t feel a need to respect its time the way I would a friend’s.
Perplexity (link to result including citations)
Fractals and chaos theory are closely intertwined concepts in mathematics and physics, with several key connections:
Self-Similarity and Complex Patterns
Fractals are geometric objects that exhibit self-similarity across different scales, meaning smaller parts resemble the whole. This property is often found in chaotic systems, where complex patterns emerge from simple rules.
Visualization of Chaotic Processes
Fractals frequently arise as visual representations of chaotic systems:
Characteristics of Complex Systems
Both fractals and chaos theory deal with complex systems that exhibit:
Applications in Financial Markets
In financial analysis, fractals and chaos theory are used to understand market behavior:
[-1 for bringing up markets. That’s a use of both, but lots of things use two things without those things being related]
Relationship to Complexity Theory
[This seems irrelevant but I think was prompted by context from previous questions]
While chaos and fractals are related, they are distinct concepts:
Understanding the difference between chaos and fractals is crucial for properly characterizing complex systems like financial markets.
In essence, fractals often emerge as visual representations or outcomes of chaotic systems, providing a way to understand and analyze the complex, non-linear behavior that chaos theory describes. Both concepts contribute to our understanding of intricate, seemingly random phenomena in nature and various scientific fields.
ChatGPT 4o
In summary, fractals are often the visual and mathematical representations of the complexity inherent in chaotic systems. Chaos theory helps explain the underlying principles that lead to the formation of fractals, and fractals provide a way to visualize and understand the seemingly unpredictable nature of chaos.
Claude
Fractals and chaos theory are interconnected concepts in mathematics that deal with complex systems and their behavior. Here’s a concise overview of their relationship:
[points to Claude on this one. It’s an important point that the others missed]
[again, two things being useful for a third thing doesn’t make them related]
Would you like me to elaborate on any of these points or discuss a specific aspect of the relationship between fractals and chaos theory?
Google Search AI Summary
What a cool set of random facts with the word “fractal” in them, Google.
How long after antibiotics is bordatello not contagious in cats?
[note: misspelled, the real spelling is Bordetella]
I was supposed to adopt a kitten at the beginning of August. On the day I was supposed to take him home, the breeder found Bordetella in another cat in the home (this is the bacteria that causes kennel cough in dogs and whooping cough in humans; it doesn’t have a common name in cats). It was early in the morning, this was a stressful topic and, due to travel times, I had an hour to figure out if I would take him that day (and risk spreading Bordetella to my existing elderly cat) or wait. Perplexity really hit this one out of the park, but it was a complicated back and forth that doesn’t make for a good example here.
I’m instead going to share this question I asked two weeks later, when I was trying to figure out if it was safe to bring the symptomatic-but-in-treatment kitten home.
ChatGPT was especially bad here. It was factually incorrect and spent 4 paragraphs reminding me to respect authority and eat my vegetables.
Perplexity (link to results including citations)
Based on the search results, there is no specific information about how long Bordetella remains contagious in humans after antibiotic treatment. However, we can draw some relevant insights from the information about Bordetella in cats and humans:
While these points don’t provide a definitive answer for humans, they suggest that the contagious period may be reduced to around 5-10 days after starting antibiotic treatment [this is what the source says, although I haven’t verified with other sources]. However, it’s important to note that this is an extrapolation from limited information, and the actual duration may vary depending on the specific Bordetella species and individual factors.
For accurate information about the contagious period of Bordetella in humans after antibiotic treatment, it would be best to consult with a medical professional or refer to guidelines from health authorities.
ChatGPT 4o
Bordetella bronchiseptica, the bacterium responsible for respiratory infections in humans, is less common in humans than in animals but can still be transmitted. The duration of contagiousness after starting antibiotics can vary depending on the individual, the severity of the infection, and the specific antibiotic used.
Generally, with many bacterial infections, including Bordetella, a person is often considered less contagious after 24 to 48 hours of appropriate antibiotic therapy [this is contradicted by the only source I checked, and dangerous. If ChatGPT gave sources maybe I could compare them]. This period allows the antibiotic to reduce the bacterial load to a level where the risk of transmission is significantly lowered.
However, it’s important to follow these guidelines:
If you or someone you know is dealing with Bordetella bronchiseptica, it’s best to consult a healthcare professional for personalized advice on managing the infection and understanding when it is safe to resume normal activities without the risk of spreading the bacteria.
Claude 3.5
The contagious period for Bordetella (also known as kennel cough) in cats typically decreases significantly after starting antibiotic treatment. However, the exact timeline can vary. Here’s a general guideline:
It’s important to note that individual cases may vary, and you should follow your veterinarian’s specific advice. They may recommend a different isolation period based on the severity of the infection and the cat’s response to treatment.
If you need more specific information about your cat’s situation, I recommend consulting with your veterinarian directly. They can provide tailored advice based on your pet’s individual case and current condition.
[This is a better written result than Perplexity but doesn’t cite sources and isn’t reliable enough to trust so doesn’t save me much work]
Google Search AI summary
No AI response, first page of results did not answer the question
Is the Aegis Shield Playful?
This one is kind of goofy; I was considering naming the new kitten Aegis but also wanted a name to signal playfulness. I knew the answer was going to be “no, are you insane?” but Perplexity made it feel cheap to check.
All three engines got the right answer, but Perplexity took a stand and ChatGPT weaseled to avoid telling me no.
Perplexity
The term “playful” is subjective, but the Aegis Shield in gaming contexts often refers to its protective capabilities rather than a playful nature. For instance, in Final Fantasy, it provides strong defense against various status effects, while in Fire Emblem, it is valued for its strategic advantages despite some drawbacks. Overall, the Aegis Shield is more about utility and defense than playfulness.
ChatGPT 4o
The Aegis shield, rooted in Greek mythology, is not typically associated with playfulness. It is more often connected with protection, power, and divine authority. The Aegis was originally described as a protective cloak or shield carried by Zeus and Athena, symbolizing their might and the fear they could instill in their enemies.
However, in a creative or metaphorical sense, you could interpret the Aegis shield as playful if it’s part of a story, game, or artwork that reimagines it in a lighter or more whimsical context. Typically, though, its traditional associations are with seriousness, protection, and strength rather than playfulness.
Claude 3.5
I apologize, but I’m not familiar with a specific item or concept called an “Aegis Shield Playful.” There could be a few possibilities:
To provide an accurate answer, I would need more context about where you encountered this term or what specific item or concept you’re referring to. Could you provide more details or clarify the context of your question?
[Correct, but miserable to read]
Google Search AI summary
No AI response
Conclusion
I’m jinxing things a little by writing this after 3-4 weeks of work, but Perplexity has genuinely sped up my work in ways no other chatbot has managed. I am both delighted by and scared of where we are on the AGI curve.
Acknowledgements and (lack of) COI
This post was financially supported by my Patreon patrons (thanks guys!) but not by any of the companies in question. Unless you count Uber One giving me a Perplexity Pro subscription that didn’t feature in this review. I have friends who work at Anthropic (which produces Claude) and a few other AI orgs, but AFAIK no one at Perplexity.