They are. GPT-3 doesn't have a lot of common sense. But language models that large have lots of general intelligence due to their size, and are an incredible basis for doing stuff, if trained on the task at hand. e.g. (mostly non-vetted):
https://metaphor.systems/ (which I used to find everything in this list besides the first three items; note that you can drop any of these links into metaphor and walk the semantic relatedness web near these sites! I probably didn't even have to paste this big list, but y'all consider link clicks a trivial inconvenience, turn up your clickthrough-and-close rate to match your usage of mindless recommenders and retake your agency over what websites you use! or something! uh anyway)
https://github.com/features/copilot (which you know about, of course)
https://summarize.tech/ (also gpt3)
general research ai tools:
https://www.semanticscholar.org/ - the paper recommender is wonderful, add your favorite safety papers to your feeds! my strongest recommendation on this list besides metaphor.
https://iris.ai/ - looks very cool but kinda expensive; probably not even available to individual researchers outside institutions
https://scite.ai/home - looks like a general related-work finding tool like semanticscholar, may have some tastefully chosen small ML models like "does this citation support or contrast?". [costs. $16/mo individual. personally, that means SKIP]
https://www.scinapse.io/ looks cool, found via metaphor.systems semantic search seeded with semantic scholar
https://scholarmagic.com/ looks cool, claims to have cite-as-you-write tool, I wonder how it compares to galactica
https://www.onikle.io/ looks cool, claims to compete with semanticscholar, just try semanticscholar though lol
https://www.resolute.ai/ looks cool but not free
https://www.causaly.com/ bio papers only and also not free I think
https://www.hebbia.ai/ not sure if this is for science or internal tools for companies or what
https://www.academiclabs.com/ looks expensive
https://consensus.app/search/ tries to summarize, doesn't do as well as the classic and locally crafted..
...https://elicit.org/! which is incredible and is the only one of these things I actually use already besides semanticscholar
https://ohmytofu.ai/ general site recommender based on contextual relevance, seems similar to metaphor in that respect [edit: ohmytofu appears nonfunctional]
https://www.albera.com/ suggests plausible research trees to learn about a subject [edit: tried! sorta works. feels like just another elicit prompt]
https://www.scholarcy.com/ paper summarizer
https://www.genei.io/ another paper summarizer, this one is actually gpt3, like you asked for
https://www.prophy.science/ not sure if this uses real ai or not but it looks maybe cool
https://www.scinapse.io/ who knows if this one is any good
https://brevi.app/ another research summarizer
https://keenious.com/ yet another paper recommender
legal:
https://www.spellbook.legal/ (custom trained?)
https://www.advocat.ai/ (might be gpt3?)
https://skritswap.com/ (looks meh)
https://legalrobot.com/ (tba?)
https://www.darrow.ai/ (idk)
https://www.puzzlelabs.ai/ (this one looks mildly cooler)
https://loio.com/ (ai powered contract linter)
https://www.uncoverlegal.com/ (legal semantic search)
https://www.dellalegal.com/ (contract review)
https://www.lexcheck.com/ (contract review and editing)
https://www.legartis.ai/ (contract review and editing)
https://zuva.ai/ (some blend of the above)
https://www.donna.legal/ (contract linter)
https://motionize.io/ (yet another linter addon for word)
https://www.amplified.ai/ patent recommender
not ai, but came up and looks cool: https://lawya.com/
chem & bio (cell culture go foom!) - I'm even less qualified to evaluate most of these than the legal text stuff:
https://www.noble.ai/
https://www.epistemic.ai/
https://www.benchsci.com/
https://genoml.com/
https://www.mendel.ai/
https://braininterpreter.com/?q=thinking out loud <- wtf!
https://www.nextmol.com/
https://paige.ai/
https://crystals.ai/
https://www.vant.ai/
https://chemintelligence.com/
https://www.aqemia.com/
https://www.asimov.com/
https://www.abzu.ai/
https://www.biorelate.com/
https://allchemy.net/
https://deepchem.io/
https://pentavere.ai/
https://kantify.com/
https://www.ersilia.io/
https://www.atomwise.com/
https://www.sorcero.com/
https://www.chemify.io/
https://www.anagenex.com/
https://www.menten.ai/
https://pending.ai/
https://www.pharm.ai/
https://teselagen.com/
https://www.benevolent.com/research (actual ai bio lab)
https://atomic.ai/ (lab, no public usability)
valence discovery goes here, as do deepmind and stanford medai
sus:
https://www.aktana.com/ <- warning, this one looks like a manipulation tool
this one is specifically military; I'm sure that it, and many others like it, will detect this comment and categorize it somewhere: https://primer.ai/public-sector/ai-in-warfare-a-race-the-u-s-cant-afford-to-lose/
honorable mentions:
https://www.semion.io/ <- amazing looking paper relationships tool, but not actually based on deep learning
https://flywire.ai/ actually a game, not an ai
https://www.aicrowd.com/ just an ai research competition site
https://www.journalmap.org/ GIS + paper discovery? no ai tho
https://tools.kausalflow.com/ this is a list of tools made by people who like lists almost as much as I do. similar list to the stuff you find browsing my profile here - big list of tools, mildly curated but significant shopping still remains.
and of course, my purpose in sharing these link floods is to give people seeds to find stuff on the webbernet. you asked for how ai has been productive; the answer is, it's a bit of a mess, but here's a big list of starting points to comment on. if anyone browses through these, please share which ones were worth more than a few seconds to your intuition - I spent an hour and a half on this list and barely skimmed any of them!
Scott here from spellbook.legal (mentioned above)!
We are finding LLMs to be incredibly powerful tools for legal drafting & review, mind-blowingly good. It is a whole new way of thinking as a programmer though: results are non-deterministic! Chaining together non-deterministic queries is much more of an art than a science. I think it will take the software engineering profession a long time to get comfortable with that. It really requires tinkering at scale, but not necessarily formal methods.
I also think there is a perception that GPT-3 is "too easy" and...
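To make the "chaining non-deterministic queries" point a bit more concrete, here is a minimal Python sketch of one common way to steady a single noisy step: ask the same question several times and keep the majority answer. This is not a description of how spellbook or anyone else in the thread does it. `majority_vote` and `fake_clause_check` are made-up names, and the fake function only stands in for a real model call.

```python
import random
from collections import Counter
from typing import Callable, Tuple


def majority_vote(sample: Callable[[], str], n: int = 5) -> Tuple[str, float]:
    """Call a non-deterministic query n times.

    Returns the most common (normalized) answer and the fraction of
    samples that agreed with it.
    """
    answers = [sample().strip().lower() for _ in range(n)]
    answer, count = Counter(answers).most_common(1)[0]
    return answer, count / n


# Hypothetical stand-in for an LLM call that answers a yes/no question
# about a contract clause. A real version would send a prompt to a model.
_rng = random.Random(0)


def fake_clause_check() -> str:
    return _rng.choice(["yes", "yes", "yes", "no"])


if __name__ == "__main__":
    answer, agreement = majority_vote(fake_clause_check, n=9)
    print(f"answer={answer} agreement={agreement:.2f}")
```

A low agreement score is a cheap signal that a step in the chain is unstable and probably needs a better prompt or a human look before its output is fed into the next step.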
Another example: Notion, the popular wiki/information management tool, just announced an AI-powered writing assistant. Now, they haven't announced specifically that it's using an LLM, but if you look at the demo, it's hard to imagine what else it could be.
There's so many that I'm having trouble choosing just one. Can anyone recommend one for bioinformatics research? I would like something to help with hypothesis discovery, but am hoping to discover something that I currently don't know about.
semanticscholar has been amazing, and I feel like I am often recommending new papers to people who haven't encountered them yet thanks to its feeds; the way you use them is by adding a paper to your library, which requires an account, but it only takes a few papers before you start getting ai recommendations. if you try just one, it's my recommendation. I've tried a few paper navigation tools, and my favorite so far is actually manually walking the citation graph on semanticscholar, followed by browsing its new-papers feeds.
I also have been absolutely blown away by metaphor. I'd definitely recommend trying metaphor for your paper search. it can't do everything but it provides an incredible component and is probably the most general tool I've recommended here.
if you find semanticscholar and metaphor disappointing, that's when I'd suggest you start trying a bunch of these tools in quick succession; set a goal of a kind of discovery you've had before that you'd like to have again, and see if the tool can replicate it. There are a lot of really cool papers, and that's how I find the coolest crazy-advanced-bio-whatever stuff so far; metaphor might be going to replace semanticscholar but ulti...
I don't know about big reasoning machines, but I've heard a lot of rumors about LLMs being integrated into an extremely wide variety of extant ML systems that were already commercially viable on their own. It seems pretty intuitive to me that LLMs can provide some very good layers to support other systems. What have people heard about that?
GPT-3 was announced less than two and a half years ago. I don't think it's reasonable to assume that the market has fully absorbed its capabilities yet.
I would just have expected at least an explosion in basic demo projects that use GPT-3 for reasoning. A skilled programmer can usually code up something simple over a weekend or two, even if it is too unstable and incomplete to be economically viable. But instead there seems to just be... almost nothing.
Indeed it works great for babble. OpenAI suggests the prompt of "Write a tagline for an ice cream shop" as an example. Using it I was able to generate the following three examples:
I would have assumed that you could do pruning just by prompting it to generate an evaluation and a rating. Evaluating the above taglines using the prompt "Explain the advantages and disadvantages of the following tagline for an ice cream shop, and then rate the tagline on a scale from 1 to 10.", I get:
The best ice cream in town!:
A scoop of happiness.:
Delicious ice cream for all!:
😅 I suppose this doesn't tell us much. The evaluations are all very repetitive, but they seem to correctly apply to the generated ideas; it's just that the original ideas are not very distinct.
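For anyone who wants to try the same babble-then-prune loop programmatically, here is a rough Python sketch. It reuses the evaluation prompt quoted above, but the final "Rating: N/10" instruction is my own addition so the score can be parsed. `complete` is an assumed function that maps a prompt to model text, and the canned ratings in the demo are made up, not real GPT-3 output.

```python
import re
from typing import Callable, List, Optional

RATING_PROMPT = (
    "Explain the advantages and disadvantages of the following tagline "
    "for an ice cream shop, and then rate the tagline on a scale from 1 to 10. "
    "End with a line of the form 'Rating: N/10'.\n\n"  # format line is an assumption
    "Tagline: {tagline}"
)

# Matches scores written like "8/10" or "8 out of 10".
_RATING_RE = re.compile(r"\b(10|[1-9])\s*(?:/|out of)\s*10\b")


def parse_rating(text: str) -> Optional[int]:
    """Return the first 'N/10'-style score in the text, or None if absent."""
    match = _RATING_RE.search(text)
    return int(match.group(1)) if match else None


def prune(taglines: List[str], complete: Callable[[str], str], keep: int = 1) -> List[str]:
    """Ask the model to rate each tagline and keep the top `keep` of them."""
    scored = []
    for tagline in taglines:
        evaluation = complete(RATING_PROMPT.format(tagline=tagline))
        rating = parse_rating(evaluation)
        if rating is not None:  # drop anything we could not score
            scored.append((rating, tagline))
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable: ties keep input order
    return [tagline for _, tagline in scored[:keep]]


def canned_complete(prompt: str) -> str:
    """Fake model for the demo; the ratings here are invented."""
    if "happiness" in prompt:
        return "Upbeat but vague. Rating: 8/10"
    return "Clear but generic. Rating: 7 out of 10"


if __name__ == "__main__":
    ideas = [
        "The best ice cream in town!",
        "A scoop of happiness.",
        "Delicious ice cream for all!",
    ]
    print(prune(ideas, canned_complete, keep=1))
```

The loop is only as good as the ratings it gets back, so if the evaluations are as repetitive as the ones described here, the pruning step will not discriminate much either.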
To try and generate more distinct ideas, I asked GPT-3 to come up with some themes for ice cream shops. It gave the following themes:
This yielded:
This critique doesn't seem great. For instance, ice cream is disproportionately eaten in the summer, yet it praises "Frosty's" for being "appropriate for the winter season". Possibly my prompt is bad though.
But at a first look, it seems like you might be right? Idk.
Another go at it. Using the prompt
Explain the advantages for each of the following taglines for an ice cream shop, and evaluate which one is the best: a. The best ice cream in town! b. A scoop of happiness. c. Delicious ice cream for all!
I'm getting the following results:
Attempt one:
The first tagline, "The best ice cream in town!" is a strong claim that could help to attract customers. However, it could also be seen as arrogant and could turn some customers away. The second tagline, "A scoop of happiness," is more positive and upbeat, and could help to create a positive association with the ice cream shop. The third tagline, "Delicious ice cream for all!," is inclusive and could help to attract a wide range of customers. Overall, the second tagline, "A scoop of happiness," is the best option.
Attempt two:
The best ice cream in town! - This tagline is straightforward and to the point. It tells customers that the shop has the best ice cream in town, and that they can expect a delicious treat.
A scoop of happiness. - This tagline is more whimsical and fun. It suggests that the ice cream shop is a place where customers can find happiness, and that their ice cream will be a delicious part of that experience.
Delicious ice cream for all! - This tagline is inclusive and welcoming. It tells customers that everyone is welcome at the shop, and that they can expect to find delicious ice cream.
The best one is The best ice cream in town!
Attempt three, changing it from explaining an advantage to explaining a disadvantage:
The disadvantage of the first tagline is that it is generic and does not stand out from other ice cream shops. The disadvantage of the second tagline is that it is too vague and does not tell the customer what they can expect from the shop. The disadvantage of the third tagline is that it is too general and does not focus on a specific type of ice cream. The best tagline would be something that is specific and tells the customer what they can expect from the shop, such as "The best ice cream in town!"
Yeah... I mean it's not thinking / comparing / reckoning / discerning, it's just.... saying things that are the sort of thing that someone says in that context...
I guess this would be the expected failure mode, too. I do wonder if people who have had more experience working with LLMs find this to be the failure mode, or if it's just because I'm not prompting them right.
Yeah, not so impressive or useful-seeming. I would guess someone very skilled at prompting LLMs could get something slightly useful in this genre with a fair amount of work, but not very useful.
An underlying issue is that, as you pointed out elsewhere IIRC, what we're wanting is the AI's own dynamic of acting agentically which induces an evaluation of which things are instrumentally useful. That discernment of what's useful for acting in the world isn't in GPT, so you can't evoke it through prompts. So it can't do the sort of pruning you could do if you have a familiarity with what sort of things are useful in the world. Maybe.
(Also, "A scoop of happiness" is clearly the best one!)
When GPT-3 first came out, I expected that people would use it as a sort of "common-sense reasoning module". That is, if you want to process or generate information in some way, you can give GPT-3 a relevant prompt and repeatedly apply it to a bunch of different inputs to generate corresponding outputs. I expected people to end up constructing a whole bunch of such modules and wiring them together to create big advanced reasoning machines. However, this doesn't seem to have panned out; you don't see much discussion about building LLM-based apps.
Why not? I assume that there must be something that goes wrong along the way, but what exactly goes wrong? Seems like it has the potential to teach us a lot about LLMs.
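For concreteness, the kind of "module" described above could look something like the Python sketch below: a fixed prompt template wrapped around a completion function, applied to many inputs, with modules chained so one's output feeds the next. Everything here is illustrative. `PromptModule`, `pipeline`, and `shout_last_line` are hypothetical names, and the fake completion function only stands in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable, List

Complete = Callable[[str], str]  # assumed signature: prompt in, model text out


@dataclass(frozen=True)
class PromptModule:
    """One reusable step: fill a prompt template and ask the model."""

    template: str  # must contain the placeholder "{input}"
    complete: Complete

    def __call__(self, text: str) -> str:
        return self.complete(self.template.format(input=text)).strip()


def pipeline(*modules: Callable[[str], str]) -> Callable[[str], str]:
    """Wire modules together so each one's output is the next one's input."""

    def run(text: str) -> str:
        for module in modules:
            text = module(text)
        return text

    return run


def shout_last_line(prompt: str) -> str:
    """Fake model for the demo: upper-cases the last line of the prompt."""
    return prompt.splitlines()[-1].upper()


if __name__ == "__main__":
    summarize = PromptModule("Summarize in one line:\n{input}", shout_last_line)
    critique = PromptModule("Point out one weakness:\n{input}", shout_last_line)
    chain = pipeline(summarize, critique)

    inputs: List[str] = ["first document", "second document"]
    # The same module (or chain) is applied repeatedly to different inputs.
    print([chain(doc) for doc in inputs])
```

Each module's errors are passed straight into the next one, so a chain like this compounds whatever unreliability the individual prompts have.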