The notion of a "configuration space" is a way of translating object descriptions into object positions. It may seem like blue is "closer" to blue-green than to red, but how much closer? It's hard to answer that question by just staring at the colors. But it helps to know that the (proportional) color coordinates in RGB are 0:0:5, 0:3:2 and 5:0:0. It would be even clearer if plotted on a 3D graph.
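The "how much closer" question becomes a plain calculation once the colors are points. A minimal sketch, using the proportional coordinates above and ordinary Euclidean distance (one reasonable metric among many):

```python
import math

# Proportional RGB coordinates from the text above.
colors = {
    "blue":       (0, 0, 5),
    "blue-green": (0, 3, 2),
    "red":        (5, 0, 0),
}

def distance(a, b):
    """Euclidean distance between two color points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

d_bluegreen = distance(colors["blue"], colors["blue-green"])
d_red = distance(colors["blue"], colors["red"])
print(d_bluegreen)  # about 4.24
print(d_red)        # about 7.07
```

Blue comes out roughly 4.2 units from blue-green and roughly 7.1 units from red, so "closer" now has a number attached, which is exactly what staring at the colors couldn't give you.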
In the same way, you can see a robin as a robin—brown tail, red breast, standard robin shape, maximum flying speed when unladen, its species-typical DNA and individual alleles. Or you could see a robin as a single point in a configuration space whose dimensions described everything we knew, or could know, about the robin.
A robin is bigger than a virus, and smaller than an aircraft carrier—that might be the "volume" dimension. Likewise a robin weighs more than a hydrogen atom, and less than a galaxy; that might be the "mass" dimension. Different robins will have strong correlations between "volume" and "mass", so the robin-points will be lined up in a fairly linear string, in those two dimensions—but the correlation won't be exact, so we do need two separate dimensions.
This is the benefit of viewing robins as points in space: You couldn't see the linear lineup as easily if you were just imagining the robins as cute little wing-flapping creatures.
A robin's DNA is a highly multidimensional variable, but you can still think of it as part of a robin's location in thingspace—millions of quaternary coordinates, one coordinate for each DNA base—or perhaps some more sophisticated view of that same information. The shape of the robin, and its color (surface reflectance), you can likewise think of as part of the robin's position in thingspace, even though they aren't single dimensions.
Just as the coordinate point 0:0:5 contains the same information as the actual HTML color blue, we shouldn't actually lose information when we see robins as points in space. We believe the same statement about the robin's mass whether we visualize a robin balancing the scales opposite a 0.07-kilogram weight, or a robin-point with a mass-coordinate of +70 grams.
We can even imagine a configuration space with one or more dimensions for every distinct characteristic of an object, so that the position of an object's point in this space corresponds to all the information in the real object itself. Rather redundantly represented, too—dimensions would include the mass, the volume, and the density.
If you think that's extravagant, quantum physicists use an infinite-dimensional configuration space, and a single point in that space describes the location of every particle in the universe. So we're actually being comparatively conservative in our visualization of thingspace—a point in thingspace describes just one object, not the entire universe.
If we're not sure of the robin's exact mass and volume, then we can think of a little cloud in thingspace, a volume of uncertainty, within which the robin might be. The density of the cloud is the density of our belief that the robin has that particular mass and volume. If you're more sure of the robin's density than of its mass and volume, your probability-cloud will be highly concentrated in the density dimension, and concentrated around a slanting line in the subspace of mass/volume. (Indeed, the cloud here is actually a surface, because of the relation VD = M.)
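That surface claim can be made concrete with a toy simulation. The numbers below (a robin of roughly 80 cm³ and density near 1 g/cm³) are illustrative guesses, not measurements; the point is only that when density is known tightly and volume is not, every sampled robin-point lies exactly on the M = V·D surface, and mass tracks volume along a slanting, nearly linear string:

```python
import random

random.seed(0)

# Hypothetical robin: density known tightly, volume quite uncertain.
points = []
for _ in range(1000):
    volume = random.gauss(mu=80.0, sigma=15.0)   # cm^3, wide uncertainty
    density = random.gauss(mu=1.05, sigma=0.01)  # g/cm^3, narrow uncertainty
    mass = volume * density                      # the relation M = V * D
    points.append((volume, density, mass))

# Because density barely varies, mass lines up almost linearly with
# volume: the probability cloud hugs a slanting line in that subspace.
vols = [p[0] for p in points]
masses = [p[2] for p in points]
mean_v = sum(vols) / len(vols)
mean_m = sum(masses) / len(masses)
cov = sum((v - mean_v) * (m - mean_m) for v, m in zip(vols, masses))
var_v = sum((v - mean_v) ** 2 for v in vols)
var_m = sum((m - mean_m) ** 2 for m in masses)
corr = cov / (var_v ** 0.5 * var_m ** 0.5)
print(round(corr, 3))  # very close to 1.0
```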
"Radial categories" are how cognitive psychologists describe the non-Aristotelian boundaries of words. The central "mother" conceives her child, gives birth to it, and supports it. Is an egg donor who never sees her child a mother? She is the "genetic mother". What about a woman who is implanted with a foreign embryo and bears it to term? She is a "surrogate mother". And the woman who raises a child that isn't hers genetically? Why, she's an "adoptive mother". The Aristotelian syllogism would run, "Humans have ten fingers, Fred has nine fingers, therefore Fred is not a human" but the way we actually think is "Humans have ten fingers, Fred is a human, therefore Fred is a 'nine-fingered human'."
We can think about the radial-ness of categories in intensional terms, as described above—properties that are usually present, but optionally absent. If we thought about the intension of the word "mother", it might be like a distributed glow in thingspace, a glow whose intensity matches the degree to which that volume of thingspace matches the category "mother". The glow is concentrated in the center of genetics and birth and child-raising; the volume of egg donors would also glow, but less brightly.
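One way to picture that glow is as a graded membership function over thingspace. The sketch below is a deliberately crude assumption—three made-up binary dimensions (genetic, birth, child-raising) and a Gaussian-style falloff from the prototype—not a model of real human psychology:

```python
import math

def mother_glow(genetic, birth, raising):
    """Toy graded membership: glow intensity falls off with squared
    distance from the prototypical mother at (1, 1, 1). The three
    dimensions and the falloff are illustrative assumptions."""
    prototype = (1.0, 1.0, 1.0)
    point = (genetic, birth, raising)
    sq_dist = sum((p - q) ** 2 for p, q in zip(point, prototype))
    return math.exp(-sq_dist)

print(mother_glow(1, 1, 1))  # central mother: brightest glow, 1.0
print(mother_glow(1, 0, 0))  # egg donor: glows, but less brightly
print(mother_glow(0, 0, 1))  # adoptive mother: also a dimmer glow
```

The central mother glows at full intensity; the egg donor and the adoptive mother still glow, just not as brightly—graded membership rather than an in-or-out Aristotelian boundary.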
Or we can think about the radial-ness of categories extensionally. Suppose we mapped all the birds in the world into thingspace, using a distance metric that corresponds as well as possible to perceived similarity in humans: A robin is more similar to another robin, than either is similar to a pigeon, but robins and pigeons are all more similar to each other than either is to a penguin, etcetera.
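A tiny sketch of such a metric, with made-up feature vectors (the numbers are rough illustrative guesses, not ornithological data, and a serious metric would normalize each dimension rather than mix raw units):

```python
import math

# Hypothetical feature vectors: (wingspan_cm, body_mass_g, top_speed_kmh)
birds = {
    "robin_a": (22, 77, 40),
    "robin_b": (23, 80, 42),
    "pigeon":  (65, 350, 80),
    "penguin": (90, 4000, 0),
}

def dist(x, y):
    """Euclidean distance between two bird-points in this toy thingspace."""
    return math.dist(birds[x], birds[y])

print(dist("robin_a", "robin_b"))  # small: robins cluster tightly
print(dist("robin_a", "pigeon"))   # larger
print(dist("robin_a", "penguin"))  # much larger still
```

Even with these crude coordinates, the two robins sit close together, the pigeon is farther off, and the penguin is farther still—the ordering the text describes.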
Then the center of all birdness would be densely populated by many neighboring tight clusters, robins and sparrows and canaries and pigeons and many other species. Eagles and falcons and other large predatory birds would occupy a nearby cluster. Penguins would be in a more distant cluster, and likewise chickens and ostriches.
The result might look, indeed, something like an astronomical cluster: many galaxies orbiting the center, and a few outliers.
Or we could think simultaneously about both the intension of the cognitive category "bird", and its extension in real-world birds: The central clusters of robins and sparrows glowing brightly with highly typical birdness; satellite clusters of ostriches and penguins glowing more dimly with atypical birdness, and Abraham Lincoln a few megaparsecs away and glowing not at all.
I prefer that last visualization—the glowing points—because as I see it, the structure of the cognitive intension followed from the extensional cluster structure. First came the structure-in-the-world, the empirical distribution of birds over thingspace; then, by observing it, we formed a category whose intensional glow roughly overlays this structure.
This gives us yet another view of why words are not Aristotelian classes: the empirical clustered structure of the real universe is not so crystalline. A natural cluster, a group of things highly similar to each other, may have no set of necessary and sufficient properties—no set of characteristics that all group members have, and no non-members have.
But even if a category is irrecoverably blurry and bumpy, there's no need to panic. I would not object if someone said that birds are "feathered flying things". But penguins don't fly!—well, fine. The usual rule has an exception; it's not the end of the world. Definitions can't be expected to exactly match the empirical structure of thingspace in any event, because the map is smaller and much less complicated than the territory. The point of the definition "feathered flying things" is to lead the listener to the bird cluster, not to give a total description of every existing bird down to the molecular level.
When you draw a boundary around a group of extensional points empirically clustered in thingspace, you may find at least one exception to every simple intensional rule you can invent.
But if a definition works well enough in practice to point out the intended empirical cluster, objecting to it may justly be called "nitpicking".
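The "feathered flying things" rule and its penguin-shaped exception can be written out directly. The example set here is hypothetical, chosen to include one atypical positive and one near-miss negative:

```python
def feathered_flying_thing(feathered, flies):
    """The simple intensional rule: 'feathered flying things'."""
    return feathered and flies

# Hypothetical extensional points: (name, feathered, flies, is_bird)
things = [
    ("robin",   True,  True,  True),
    ("sparrow", True,  True,  True),
    ("penguin", True,  False, True),   # the exception to the rule
    ("bat",     False, True,  False),  # a near-miss non-bird
]

exceptions = [name for name, f, fl, bird in things
              if feathered_flying_thing(f, fl) != bird]
print(exceptions)  # the rule misclassifies penguins, yet it still
                   # points the listener at the bird cluster
```

The rule gets three of four cases right and fails on penguins—which is the point: a simple intensional definition can still lead the listener to the cluster without matching it exactly.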
You could give relatively simple verbal intensional definitions to try and lead someone to the bird cluster, yes. But if you had someone who wasn't practically accessible through those verbal communications, how would you do it?
You'd have to show extensional examples, positives and negatives, and indicate the value of each example by some clear and consistent signal.
You couldn't give all possible extensional examples, so you would have to select some. And you couldn't give them all at once, so you'd have to present them in a particular order.
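The bare shape of such a lesson can be sketched as an ordered sequence of labeled examples with one consistent signal. The particular selection and ordering below are arbitrary placeholders—choosing them well is precisely what a theory of instruction must specify:

```python
# A minimal sketch of extensional teaching for the concept "bird":
# selected examples, presented in a fixed order, each paired with
# a clear and consistent yes/no signal.
lesson = [
    ("robin",    "yes"),
    ("sparrow",  "yes"),
    ("bat",      "no"),   # a near-miss negative, to mark the boundary
    ("penguin",  "yes"),  # an atypical positive, to widen the concept
    ("airplane", "no"),
]

for example, signal in lesson:
    print(f"{example}: {signal}")
```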
What is the theory for finding optimized selections and orderings of examples for leading the learner to the cluster? How does that theory extend to the more complicated case where you have to communicate the subtypes within the "bird" cluster?
This is one of the many things addressed by the Theory of Direct Instruction presented in Engelmann and Carnine's text Theory of Instruction: Principles and Applications. [They call it a "multi-dimensional non-comparative concept" ("non-comparative" meaning the value of any example is absolute rather than relative to the last), or "noun" for short.]
And of course, if you had to select and order the presentation of simple verbal definitions/descriptions as examples themselves, the theory would also have application.
Please see here for a clarification of what "someone who wasn't practically accessible through those verbal communications" means, and a more concrete example of teaching the higher-order class 'vehicles' and sub-classes.