Really, I'm proposing something that has to be much easier than the problem people seem to get fixated on, i.e. the general purpose question answering machine, which was a staple of science fiction decades ago (leading to the parody Q: "What is the meaning of life?" A: 42). Besides which, the goal of one crisp answer seems aimed at a childish mentality -- except in the (rarer than we may think) cases when there really is one crisp answer.
After all, I wrote "What occurred to me though, is that computer science could do something quite useful intermediate between "general purpose question answerer" and the old database paradigm of terms ANDed or ORed together."
Between two people, either there would be some implicit understanding (like "We're talking about pre-USSR because that's the subject of the seminar we're in") or the question-ee might have to say "What are the parameters of the Russia you're talking about?"
Then again the semi-smart search engine I'd like to see could just decline to resolve ambiguities, and return all articles treating any reasonable interpretation of the phrase, and make it the user's job to add qualifying phrases as needed.
I am dreaming up a Simpsons episode in which a computer can convince a panel of experts that it is Bart Simpson, and the ensuing debate as to whether that was "passing the Turing Test".
the semi-smart search engine I'd like to see could just decline to resolve ambiguities
Why? That seems really unhelpful. I'd much prefer the engine to answer like a human expert, who habitually starts with "That depends on what you mean by..." I imagine it could assess its confidence in its choices for interpretation, discard any with less than (say) 10% probability, and if more than one remains, give them to me in an ordered list, to click on the one I mean. Kind of like a Wikipedia disambiguation page. (If more than one term needs clarification...
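To make the filtering step concrete, here is a minimal Python sketch of it, not a claim about how any real engine works. The `Interpretation` class, the function names, the candidate labels, and the probabilities are all invented for illustration; where the confidence numbers would actually come from is the hard part and is not addressed here.

```python
from dataclasses import dataclass


@dataclass
class Interpretation:
    label: str          # human-readable reading of the ambiguous term
    probability: float  # engine's (hypothetical) confidence in this reading


def plausible(candidates, threshold=0.10):
    """Drop readings below the threshold and order the rest, most likely first."""
    kept = [c for c in candidates if c.probability >= threshold]
    return sorted(kept, key=lambda c: c.probability, reverse=True)


def respond(term, candidates, threshold=0.10):
    """Answer directly if one reading survives, otherwise offer a choice."""
    kept = plausible(candidates, threshold)
    if not kept:
        return f"No confident interpretation of '{term}'."
    if len(kept) == 1:
        return f"Interpreting '{term}' as: {kept[0].label}"
    lines = [f"That depends on what you mean by '{term}':"]
    lines += [f"  {i}. {c.label}" for i, c in enumerate(kept, start=1)]
    return "\n".join(lines)


# Made-up candidates and confidences, purely for illustration.
candidates = [
    Interpretation("the Russian Empire", 0.30),
    Interpretation("the present-day Russian Federation", 0.50),
    Interpretation("the Soviet Union", 0.15),
    Interpretation("a town named Russia", 0.05),
]
print(respond("Russia", candidates))
```

With these invented numbers the last candidate falls below the 10% cutoff and the other three come back as an ordered list, much like a disambiguation page.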
Some time ago, I was following a conversation about Wolfram Alpha (http://www.wolframalpha.com/), an attempt to implement a sort of general purpose question answerer, something people have dreamed about computers doing for decades. Despite the theoretical ability to find out virtually anything from the Internet, we seem pretty far from any plausible approximation of this dream (at least for general consumption). My first attempt was:
Q: "who was the first ruler of russia?"
A: Vladimir Putin
It's a problematic question that depends on questions like "When did Russia become Russia?", or "What do we count, historically, as Russia?", or even what one means by "ruler", and a reasonably satisfactory answer would have had to be fairly complicated -- either that, or the question would have to be reworded to be so precise that one name could serve as the answer.
On another problematic question I thought it did rather well:
Q: what is the airspeed velocity of an unladen african swallow?
What occurred to me though, is that computer science could do something quite useful intermediate between "general purpose question answerer" and the old database paradigm of terms ANDed or ORed together. (Note that what Google does is neither of these, nor should it be placed on a straight line between the two -- but discussion of Google would take me far off topic).
A simple example of what I'd really like is a search engine that matches *concepts*. Does anyone know of such a thing? If it exists, I should possibly read about it and shut up, but let me at least try to be sure I'm making the idea clear:
E.g., I'd like to enter <<rulers of russia>>, and get a list of highly relevant articles.
Or, I'd like to enter <<repair of transmission of "1957 Ford Fairlane">> and get few if any useless advertisements, and something much better than all articles containing the words "repair" "transmission" and "1957 Ford Fairlane" -- e.g., *not* an article on roof repair that happened to mention that "My manual transmission Toyota truck rear-ended a 1957 Ford Fairlane".
It seems to me mere implementation of a few useful connectives like "of", and maybe the recognition of an adjective-noun phrase, and some heuristics like expanding words to *OR*ed lists of synonyms (ruler ==> (president OR king OR dictator ...)) would yield quite an improvement over the search engines I'm familiar with.
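A rough Python sketch of the heuristics just listed, under loud assumptions: the `SYNONYMS` table is hand-written and hypothetical, plural stripping is a crude stand-in for real stemming, and "of" is approximated by requiring every concept to appear in the same sentence. None of this is how any existing search engine is known to work.

```python
import re

# Hypothetical, hand-written synonym table (an assumption for illustration).
SYNONYMS = {
    "ruler": ["ruler", "president", "king", "dictator"],
}


def normalize(word):
    """Lowercase and strip a trailing 's' (crude stand-in for stemming)."""
    word = word.lower()
    return word[:-1] if len(word) > 3 and word.endswith("s") else word


def parse_query(query):
    """Split a query into concepts; each concept is a set of ORed alternatives.

    Quoted phrases are kept literally and the connective 'of' is dropped.
    """
    concepts = []
    for quoted, bare in re.findall(r'"([^"]+)"|(\S+)', query):
        if quoted:
            concepts.append({quoted.lower()})
        elif bare.lower() != "of":
            base = normalize(bare)
            concepts.append(set(SYNONYMS.get(base, [base])))
    return concepts


def sentence_matches(sentence, concepts):
    """True if the sentence contains at least one alternative of every concept."""
    text = sentence.lower()
    words = {normalize(w) for w in re.findall(r"[a-z0-9]+", text)}
    return all(
        any((alt in text) if " " in alt else (alt in words) for alt in alts)
        for alts in concepts
    )


def document_matches(document, concepts):
    """Approximate 'X of Y' by requiring all concepts within one sentence."""
    sentences = re.split(r"(?<=[.!?])\s+", document)
    return any(sentence_matches(s, concepts) for s in sentences)


query = parse_query('repair of transmission of "1957 Ford Fairlane"')
roofing = ("Roof repair is expensive. My manual transmission Toyota truck "
           "rear-ended a 1957 Ford Fairlane.")
manual = "This guide covers repair of the transmission in a 1957 Ford Fairlane."
print(document_matches(roofing, query), document_matches(manual, query))
```

The roofing article contains all three terms but never in one sentence, so it is rejected, while the repair guide is accepted. Sentence-level co-occurrence is only a cheap proxy for actually parsing "of"; it would still be fooled by a sentence that happens to mention all three concepts in an unrelated way.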
This level of simple grammatical understanding is orders of magnitude simpler than the global analysis and knowledge of unlimited sets of information sources, such as a general purpose question answerer would require.
I'd like to know if anyone else finds this interesting, or knows of any leads for exploring anything related to these possibilities.
By the way, when I entered "rulers of russia" into Wolfram Alpha, the answer was still Putin, with brief mention of others going back to 1993, so "Russia" seems to be implicitly defined as the entity that has existed since 1993, and there is an attempt at making it an *answer to the (assumed) question* rather than a good list of articles that could shed light on various reasonable interpretations of the phrase.