James Camacho

Comments

I claim that there's just always a distribution over meanings, and it can be sharp or fuzzy or bimodal or any sort of shape.

The issue is you cannot prove this. If you're considering any possible meaning, you will run into recursive meanings (e.g. "he wrote this") which are non-terminating. So, the truthfulness of any sentence, including your claim here, is not defined.

You might try limiting the number of steps in your interpretation: only meanings that terminate to a probability within $n$ steps count. However, you still have to define or believe in the machine that runs your programs.

Now, I'm generally of the opinion that this is fine. Your brain is one such machine, and being able to assign probabilities is useful for letting your brain (and its associated genes) proliferate into the future. In fact, science is really just picking more refined machines to help us predict the future better. However, keep in mind that (1) this eventually boils down to "trust, don't verify", and (2) you've committed suicide in a number of worlds that don't operate in the way you've limited yourself. I recently had an argument with a Buddhist whose point was essentially, "that's the vast majority of worlds, so stop limiting yourself to logic and reason!"

Natural languages, by contrast, can refer to vague concepts which don’t have clear, fixed boundaries

I disagree. I think it's merely that the space is so large that it's hard to pin down where the boundary is. However, language does define natural boundaries (that are slightly different for each person and language, and shift over time). E.g., see "Efficient compression in color naming and its evolution" by Zaslavsky et al.

"The boundary of a boundary is zero"

 

I think this is mostly arbitrary.

So, in the 20th century Russell's paradox came along and forced mathematicians into creating constructive theories. For example, in ZFC set theory, you begin with the empty set {}, and build out all sets with a tower of lower-level sets. Maybe the natural numbers become {}, {{}}, {{{}}}, etc. Using different axioms you might get a type theory; in fact, any programming language is basically a formal logic. The basic building blocks, like the empty set or the builtin types, are called atoms.
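As a minimal sketch of "building out from the empty set", the encoding {}, {{}}, {{{}}} mentioned above can be written with nested `frozenset`s. The function names `zermelo`, `show`, and `depth` are made up for this illustration, and this is only one possible encoding of the naturals, not the one ZFC texts usually pick.

```python
def zermelo(n: int) -> frozenset:
    """Return the n-th numeral in the encoding 0 = {}, n + 1 = {n}."""
    s = frozenset()          # the single atom: the empty set
    for _ in range(n):
        s = frozenset([s])   # wrap the previous numeral in one more pair of braces
    return s


def show(s: frozenset) -> str:
    """Render a nested frozenset using brace notation."""
    return "{" + ", ".join(sorted(show(x) for x in s)) + "}"


def depth(s: frozenset) -> int:
    """Nesting depth below the outermost set; recovers n from zermelo(n)."""
    return 0 if not s else 1 + max(depth(x) for x in s)


for n in range(4):
    print(n, show(zermelo(n)))
# 0 {}
# 1 {{}}
# 2 {{{}}}
# 3 {{{{}}}}
```

Everything here is assembled from one atom plus one constructor, which is the sense of "tower" used above.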

In algebraic topology, the atom is a simplex: line segments in one dimension, triangles in two dimensions, tetrahedrons in three dimensions, and so on. I think they generally use an axiom of infinity, so each simplex is infinitely small (convenient when you have smooth curves like circles), but they need to be defined at the lowest level. This includes how you define simplices from lower-dimensional simplices! And this is where the boundary comes in.

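Say you have a triangle (2-simplex) $[A, B, C]$. Naively, we could define its boundary as the sum of its edges:

$$\partial [A, B, C] = [A, B] + [B, C] + [A, C]$$

However, if we stuck two of them together, the shared edge $[A, C]$ wouldn't disappear from the boundary:

$$\partial \big([A, B, C] + [A, C, D]\big) = [A, B] + [B, C] + [C, D] + [A, D] + 2[A, C]$$

This is why they usually alternate sign, so

$$\partial [A, B, C] = [B, C] - [A, C] + [A, B]$$

Then, since

$$[A, C] = -[C, A]$$

you could also write it like

$$\partial [A, B, C] = [A, B] + [B, C] + [C, A]$$

It's essentially a directed loop around the triangle (the analogy breaks when you try higher dimensions, unfortunately). Now, the famous quote "the boundary of a boundary is zero" is relatively trivial to prove. Let's remove just the two vertices $A_i, A_j$ (with $i < j$) from the simplex $[A_1, A_2, \dots, A_i, \dots, A_j, \dots, A_n]$. Removing the vertex in position $k$ contributes a sign $(-1)^{k-1}$, and $\hat{A}_k$ marks an omitted vertex. If we remove $A_i$ first, $A_j$ slides down to position $j - 1$, so we'd get

$$(-1)^{i-1} (-1)^{j-2} \, [A_1, \dots, \hat{A}_i, \dots, \hat{A}_j, \dots, A_n]$$

while removing $A_j$ first gives

$$(-1)^{j-1} (-1)^{i-1} \, [A_1, \dots, \hat{A}_i, \dots, \hat{A}_j, \dots, A_n]$$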

The first is $-1$ times the second, so everything will zero out. However, it's only zero because we decided edges should cancel out along shared boundaries. We can choose a different system where they add together, which leads to the permanent as a measure of volume instead of the determinant. Or, one that uses a much more complex relationship (re: immanant).
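The cancellation argument can be checked mechanically. The sketch below represents a chain as a dict from vertex tuples to integer coefficients (a representation chosen for this example, not a standard library API) and compares the alternating-sign boundary with the naive all-plus version. The flag name `alternate` is hypothetical.

```python
from collections import defaultdict


def boundary(chain, alternate=True):
    """Boundary of a formal integer sum of simplices.

    `chain` maps a simplex (tuple of vertex labels) to its coefficient.
    With alternate=True, dropping the vertex at 0-based position k gets
    sign (-1)**k; with alternate=False every face gets sign +1.
    """
    out = defaultdict(int)
    for simplex, coeff in chain.items():
        for k in range(len(simplex)):
            face = simplex[:k] + simplex[k + 1:]
            sign = (-1) ** k if alternate else 1
            out[face] += sign * coeff
    # Drop faces whose coefficients cancelled to zero.
    return {face: c for face, c in out.items() if c != 0}


triangle = {("A", "B", "C"): 1}
glued = {("A", "B", "C"): 1, ("A", "C", "D"): 1}

# Alternating signs: the shared edge (A, C) cancels, and applying the
# boundary twice leaves nothing.
print(boundary(glued))
print(boundary(boundary(triangle)))

# All-plus signs: the shared edge is counted twice and every vertex
# survives the second boundary with coefficient 2.
print(boundary(glued, alternate=False))
print(boundary(boundary(triangle, alternate=False), alternate=False))
```

The only difference between the two runs is the sign rule, which is the point of the paragraph above: "zero" is a consequence of choosing cancellation, not something forced by the simplices themselves.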

I'm certainly not an expert here, but it seems like fermions (e.g. electrons) exchange via the determinant, bosons (e.g. mass/gravity) use the permanent, and more exotic particles (e.g. anyons) use the immanant. So, when people base their speculations on the "boundary of a boundary" being a fundamental piece of reality, it bothers me.
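To make the determinant/permanent contrast concrete, here is a small sketch that computes both as sums over permutations, differing only in the weight attached to each permutation. The helper names (`perm_sign`, `weighted_sum`) are invented for this example; an immanant would plug a character of the symmetric group into the same `weight` slot, which is not implemented here.

```python
from itertools import permutations


def perm_sign(p):
    """Sign of a permutation (tuple of indices), computed by counting inversions."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1


def weighted_sum(matrix, weight):
    """Sum over permutations p of weight(p) * prod_i matrix[i][p[i]]."""
    n = len(matrix)
    total = 0
    for p in permutations(range(n)):
        term = weight(p)
        for row, col in enumerate(p):
            term *= matrix[row][col]
        total += term
    return total


def determinant(matrix):
    return weighted_sum(matrix, perm_sign)


def permanent(matrix):
    return weighted_sum(matrix, lambda p: 1)


m = [[1, 2], [3, 4]]
swapped = [[3, 4], [1, 2]]  # the same rows, exchanged

print(determinant(m), determinant(swapped))  # -2 2  (flips sign under exchange)
print(permanent(m), permanent(swapped))      # 10 10 (unchanged under exchange)
```

Exchanging two rows flips the determinant and leaves the permanent alone, which mirrors the antisymmetric versus symmetric exchange behavior the comment attributes to fermions and bosons.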

That assumes the law of non-contradiction. I could hold the belief that everything will happen in the future, and my prediction will be right every time. Alternatively, I can adjust my memory of a prediction to be exactly what I experience now.

Also, predicting the future only seems useful insofar as it lets the belief propagate better. The more rational and patient the hosts are, the more useful this skill becomes. But, if you're thrown into a short-run game (say ~80yrs) that's already at an evolutionary equilibrium, combining this skill with the law of non-contradiction (i.e. only holding consistent beliefs) may get you killed.

Within a system with competition, why would the most TRUE thing win? No, the most effective thing wins.

 

Why do you assume truth even exists? To me, it seems like there are just different beliefs that are more or less effective at proliferation. For example, in 1300s Venice, any belief except Catholicism would destroy its hosts pretty quickly. Today, the same beliefs would get shouted down by scientific communities and legislated out of schools.

Answer by James Camacho
  1. Is this apparent parity due to a mass exodus of employees from OpenAI, Anthropic, and Google to other companies, resulting in the diffusion of "secret sauce" ideas across the industry?

No. There isn't much "secret sauce", and these companies never had a large amount of AI talent to begin with. Their advantage is being in a position with hype/reputation/size to get to market faster. It takes several months to set up the infrastructure (getting money, data, and compute clusters), but that's really the only hurdle.

  1. Does this parity exist because other companies are simply piggybacking on Meta's open-source AI model, which was made possible by Meta's massive compute resources? Now, by fine-tuning this model, can other companies quickly create models comparable to the best?

No. "Everyone" in the AI research community knew how to build Llama, multi-modal models, or video diffusion models a year before they came out. They just didn't have $10M to throw around.

Also, fine-tuning isn't really the way to go. I can imagine people using an existing model as a teacher during the warm-up phase, but the coding infrastructure doesn't really exist to fine-tune or integrate another model as part of a larger one. It's usually easier to just spend the extra time securing money and training.

  1. Is it plausible that once LLMs were validated and the core idea spread, it became surprisingly simple to build, allowing any company to quickly reach the frontier?

Yep. Even five years ago you could open a Colab notebook and train a language translation model in a couple of minutes.

  1. Are AI image generators just really simple to develop but lack substantial economic reward, leading large companies to invest minimal resources into them?

No, images are much harder than language. With language models, you can exactly model the output distribution, while the space of images is continuous and much too large for that. Instead, the best models measure the probability flow (e.g. diffusion/normalizing flows/flow-matching), and follow it towards high-probability images. However, parts of images should be discrete. You know humans have five fingers, or text has words in it, but flows assume your probabilities are continuous.

Imagine you have a distribution that looks like

__|_|_|__

A flow will round out those spikes into something closer to

_/^\/^\/^\__

which is why gibberish text or four-and-a-half fingers appear. In video models, this leads to dogs spawning and disappearing into the pack.
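The rounding-out effect in the two sketches above can be reproduced numerically. This is not a diffusion or flow model; it only blurs a few point masses with a Gaussian, as a stand-in for what happens when a continuous density has to approximate discrete spikes. The spike locations and widths are arbitrary values picked for the illustration.

```python
import math


def smoothed_density(x, spikes, sigma):
    """Density at x after blurring equal-weight point masses with a Gaussian of width sigma."""
    norm = 1.0 / (len(spikes) * sigma * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / sigma) ** 2) for s in spikes)


spikes = [4.0, 5.0, 6.0]  # hypothetical discrete values, e.g. a count that should be an integer

for sigma in (0.05, 0.3):
    on_spike = smoothed_density(5.0, spikes, sigma)
    between = smoothed_density(4.5, spikes, sigma)
    print(f"sigma={sigma}: density at 5.0 = {on_spike:.4f}, at 4.5 = {between:.4f}")
```

With a narrow blur the density halfway between two spikes is negligible; with a wider one it becomes a sizable fraction of the peak, so samples like "four and a half" stop being rare.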

  1. Could it be that legal challenges in building AI are so significant that big companies are hesitant to fully invest, making it appear as if smaller companies are outperforming them?

Partly when it comes to image/video models, but this isn't a huge factor.

  1. And finally, why is OpenAI so valuable if it’s apparently so easy for other companies to build comparable tech? Conversely, why are these no name companies making leading LLMs not valued higher?

I think it's because AI is a winner-takes-all competition. It's extremely easy for customers to switch, so they all go to the best model. Since ClosedAI already has funding, compute, and infrastructure, it's risky to compete against them unless you have a new kind of model (e.g. LiquidAI), reputation (e.g. Anthropic), or are a billionaire's pet project (e.g. xAI).

Religious freedoms are a subsidy to keep the temperature low. There's the myth that societies will slowly but surely get better, kind of like a gradient descent. If we increase the temperature too high, an entropic force would push us out of a narrow valley, so society could become much worse (e.g. nobody wants the Spanish Inquisition). It's entirely possible that the stable equilibrium we're being attracted to will still have religion.

Can't you choose an arbitrary encoding procedure? Choosing a different one only adds a constant number of bits. Also, my comment on discounted entropy was a little too flippant. What I mean is closer to entropy rate with a discount factor, like in soft actor-critic. Maximizing your ability to have options in the future requires a lot of "agency".
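The "constant number of bits" remark is the invariance theorem for Kolmogorov complexity. Stated for two universal machines, here called $U$ and $V$ (labels chosen for this note), with $K_U(x)$ the length of the shortest program that makes $U$ output $x$:

```latex
K_U(x) \;\le\; K_V(x) + c_{U,V} \qquad \text{for all strings } x
```

The constant $c_{U,V}$ depends only on the two machines (roughly, the length of an interpreter for $V$ written for $U$) and not on $x$.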

Maybe consciousness should be more than just agency, e.g. if a chess bot were trained to maximize entropy, games wouldn't be as strategic as if it wants to get a high*-energy payoff. However, I'm not convinced energy even exists? Humans learn strategy because their genes are more likely to survive, thrive, and have choices in the future when they win. You could even say elementary particles are just the ones still around since the Big Bang.

*Note: The physicists should reverse the sign on energy. While they're at it, switch to inverse-temperature.

Consider all programs encoding isomorphisms from a rock to something else (e.g. my brain, or your brain). If the program takes $n$ bits to encode, we add $2^{-n}$ times the other entity to the rock (times some partition number so all the weights add up to one). Since some programs are automorphisms, we repeatedly do this until convergence.

The rock will now possess a tiny bit of consciousness, or really any other property. However, where do we get the original "sources" of consciousness? If you're a solipsist, you might say, "I am the source of consciousness." I think a better definition is your discounted entropy.

Answer by James Camacho

An isomorphism isn't enough. Stealing from Robert (Lastname?), you could make an isomorphism from a rock to your brain, but you likely wouldn't consider it "conscious". You have to factor out the Kolmogorov complexity of the isomorphism.
