All of MoritzG's Comments + Replies

Examples: There is no evidence for the existence of human races. There is no evidence for biological differences between the sexes.

I will trust a person who states "I think the sky is brown." more than a person who counters with "The sky is certainly not brown; it is gray, because we have not seen it."

Yes, and that is in the links I provided under "Resources". It is a perfect falsification; sadly, I distrust the finding, because while finding it I also found that teens and older patients do get misdiagnosed, since the symptoms and consequences are similar. Again, I did write "misdiagnosed for another".

Oh, sorry.

I think that other people think: BPD and ASD are independent, orthogonal mental afflictions. The autism spectrum reaches from zero to +100%.

But I think they share a remarkable resemblance in that they are opposite deviations from the norm: one spectrum reaching from "emotionally unstable" to "emotionally unresponsive", or from -100% to +100%.

Better? (This editor is impossible for me.)

3Elizabeth
Yes, that's helpful, thank you. I don't agree though. I think characterizing autistic people as "emotionally unresponsive" is incorrect, and that they share with borderlines the trait of... ~when sufficiently stressed, only holding one emotion at a time, and holding that one very strongly. Things get very black and white and very overwhelming.  Sources: autists and BPDs I have known, every discussion of BPD, Temple Grandin on her autism.
5lsusr
An alternative theory considers autism and schizophrenia to be opposites.

| BPD | ASD |
|---|---|
| socially hyper-sensitive | socially hypo-sensitive |
| hyper-emotional | hypo-emotional |
| hyper-neurotic | hypo-neurotic |
| interest in people | interest in things |
| focus on feelings | focus on facts |
| mostly females | mostly males |
| wants to express subjective feelings | wants to arrange and order the world according to objective criteria |

(What does the person want to bring to the outside?)

Overall I see an overemphasis on the observable, empirical disabilities and consequences, and a lack of understanding of the motivation, the mediating cause. Because psychology depends on self-reporting, there is a strong perception bias that is not accounted for, yet answers are reported as fact.

Another non-primate animal doing something similar: Alex the gray parrot, https://en.wikipedia.org/wiki/Alex_(parrot) . He also made up his own terms, such as "cork nut" for almond.

I don't get the joke. Who is "Rick Astley"?

9gbear605
https://knowyourmeme.com/memes/rickroll

Yes, but I am posing the WHY question. In this case it is just an averaging effect, not a feedback controller.

"parabola" That would be a third category then: No correlation observed because the aggregated observation cancels out the effect working both ways.

1Adam Bull
No, it's just causation without correlation; correlation is defined to be the aggregate effect.

"freedom to self-alter its own error function"

How? By changing the function alone or by changing the input to that function?

2lsusr
By tuning the function's parameters that define success. In your words, "[b]y changing the function alone".

"Styling" I will (and can) make the edits.

"parody" I call it a polemic analogy.

"seems, if not in conflict with" I think you noticed that there is no contradiction, but I agree that I need to clarify. Faced with a massive lack of information and the task to predict the future it is clear that it would be pure luck to make the best decision. Operating with that mindset might even be hindering.

" I must seek C*(A+B) at a lower cost." I was trying to get into what to choose / look for in a finite set with competition. A B C ... are terms of criteria that I esti

... (read more)

Thank you for commenting.

"is offputting enough" That would be a sensibility of yours and not a rational argument.

"implication that young women are not competent, and the generalization overall, and the unstated implication that HR has anywhere near the power that you ascribe to it" I made no such statements. "Many" is not the same as "all". I include employees of headhunting companies as HR workers and these do have power when it comes to early screening including the assessment of qualification. I had plenty such talks where I could not even make the ot

... (read more)

I lost the formatting when I pasted the text. I managed to switch to the Markdown interpretation and bring the list back.

I came across this:

The New Dawn of AI: Federated Learning

" This [edge] update is then averaged with other user updates to improve the shared model."

I do not know how that is meant, but when I hear the word "average" my alarms always sound.

Instead of a shared NN, each device should get multiple slightly different NNs/weight sets and report back which set was worst/unfit and which best/fittest.
Each set/model is a hypothesis, and the test in the world is an evolutionary/democratic falsification.
Those mutants who fail to satisfy the most customers are dropped.
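A minimal sketch of that idea in Python; the linear "model", fitness signal, population size, and mutation scale are all invented for illustration, not taken from the article:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mutants(base_weights, n_mutants=8, sigma=0.05):
    """Create slightly different copies of the shared weights."""
    return [base_weights + rng.normal(0.0, sigma, base_weights.shape)
            for _ in range(n_mutants)]

def device_fitness(weights, local_data):
    """Stand-in for on-device evaluation (higher is better):
    negative squared error of a linear model on this device's data."""
    x, y = local_data
    return -np.mean((x @ weights - y) ** 2)

def selection_round(base_weights, devices):
    """Ship the mutants out, collect fitness reports, keep the fittest.
    The unfit mutants are simply dropped."""
    mutants = make_mutants(base_weights)
    scores = np.array([[device_fitness(m, d) for m in mutants]
                       for d in devices])
    return mutants[int(np.argmax(scores.mean(axis=0)))]

# Toy usage: three devices, each with its own local (x, y) data.
true_w = np.array([2.0, -1.0])
devices = []
for _ in range(3):
    x = rng.normal(size=(20, 2))
    devices.append((x, x @ true_w + rng.normal(0.0, 0.1, 20)))

w = np.zeros(2)
for _ in range(300):
    w = selection_round(w, devices)
print(w)  # drifts toward true_w by selection alone, no gradients
```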

2lsusr
NNs are a big data approach, tuned by gradient descent. Because NNs are a big data approach, every update is necessarily small (in the mathematical sense of first-order approximations). When updates are small like this, averaging is fine. Especially considering how most neural networks use sigmoid activation functions. While this averaging approach can't solve small data problems, it is perfectly suitable to today's NN applications where things tend to be well-contained, without fat tails. This approach works fine within the traditional problem domain of neural networks.
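For contrast with the selection sketch above, a toy version of the averaging step lsusr describes, under the same invented linear-model setup (an illustration of the principle, not the actual federated-learning implementation):

```python
import numpy as np

def local_update(weights, local_data, lr=0.01):
    """One small gradient step on a device's own data (squared error)."""
    x, y = local_data
    grad = 2 * x.T @ (x @ weights - y) / len(y)
    return weights - lr * grad

def federated_round(weights, devices):
    """Every device takes its own small step; the server averages.
    Because each step is a first-order (small) update, the average
    stays close to what every individual device wanted."""
    return np.mean([local_update(weights, d) for d in devices], axis=0)
```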

When it comes to intelligence, rationality, depression, and autism, the evolutionary selection aspect is interesting, because we all know that the mentioned mental properties lower your chances of raising many children today.

https://www.google.com/search?q=%22falling+off+the+cliff%22+evolution+OR+selection+autism+OR+depression

https://www.psypost.org/2017/02/study-suggests-autism-risk-genes-favored-natural-selection-47876

https://evolution-institute.org/the-darwinian-causes-of-mental-illness/

Too much good quickly turns bad.

As we know and you mentioned, humans do learn from small data. We start with priors that are hopefully not too strong and go through the known processes of scientific discovery. NNs do not have that meta-process or any introspection (yet).
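A tiny worked example of what "priors that are hopefully not too strong" buy you, as a Beta-Binomial update; the prior counts and the data are made up for illustration:

```python
from fractions import Fraction

# Weak Beta(1, 1) prior: one pseudo-success, one pseudo-failure.
alpha, beta = 1, 1

# Small data: 4 successes in 5 trials.
s, n = 4, 5

# Posterior mean = (alpha + s) / (alpha + beta + n)
print(Fraction(alpha + s, alpha + beta + n))  # 5/7: the small data dominates

# With a strong Beta(50, 50) prior, the same 5 trials barely move it:
# (50 + 4) / (100 + 5) is roughly 0.51, still stuck near 1/2.
```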

"You cannot solve this problem by minimizing your error over historical data. Insofar as big data minimizes an algorithm's error over historical results ... Big data compensates for weak priors by minimizing an algorithm's error over historical results. Insofar as this is true, big data cannot reason about... (read more)

3lsusr
This is very important. I plan to follow up with another post about the necessary role of hypothesis amplification in AGI. Edit: Done.

Why did you not go for engineering (like me)? Still some math proofs, but no one listens, and they will not test it either.

Twice I made the mistake of asking WHY it is the way it is. All I got was "look at the proof, it works out". That is why I have little respect for mathematicians in general.

Because the SIMD approach is bad for 2D-by-2D matrix multiplication, NVIDIA has introduced Tensor Cores in the Volta architecture.

Article about it:

https://www.anandtech.com/show/12673/titan-v-deep-learning-deep-dive/3

Or when you attended a gathering/event, then read about it and think that it must have been an entirely different event.

"newspaper from ten years ago, ... what happened in the topic afterwards and then judge how informative the article was"

As a German you will know the saying: "Nichts ist so alt wie die Zeitung von gestern." -> "Nothing is as old as yesterday's paper."

N.N. Taleb calls it noise.

At first it is either wrong, without consequence, or propaganda; then it is outdated. A historian will find 99% of all "news" to be little more than an "interesting time piece", at best representative of the thinking and style of the era.

3ChristianKl
The problem isn't just one of propaganda whereby certain interests get pushed. It's also one of a general lack of deep insight by the reporter into the subject they are reporting on, and their need to simplify matters. The argument is that many people have the experience that when they encounter a newspaper article about a subject where they have domain knowledge, they discover that the article is full of mistakes. If you then generalize that observation over the whole newspaper, it leads to the conclusion that the paper isn't better than sawdust. The exercise for the reader would be to go to the average science section of a prestigious newspaper from ten years ago, look at the study on which the article is based and at what happened in the topic afterwards, and then judge how informative the article was. The next step is to ask yourself what it would mean if that quality level generalized over the whole newspaper.

There is a non-zero risk of death. There have been cases where some candidates suffered severe permanent damage and it took a while to figure out why. We are all slightly different.

Even with animal trials there is a risk. I do not think there is an answer to your question.

1Yoav Ravid
I think an answer is mainly how "fat" the tail is, which you addressed. I am wondering, though, how much of the risk to animals in animal trials applies to humans. Not because of differences in biology, but whether we can know how much we don't know about a vaccine before we give it, and only give the least uncertain ones to humans (because I assume we're being less careful with animals). I guess we can't know much, and my prior on testing vaccines was "dangerous on average, with a really fat tail".

Oh, I should have mentioned that this also assumes a constant factor between the total number of people infected in the past and the number of currently infectious. That is true as long as the spread is exponential, but that is the entire assumption anyhow.

In Germany the data is at the moment consistent with:

+22% infected per day, which is exactly +3 new people infected, within one week of infection, by every infected person. (This assumes there are no imported cases: the 7th root of ((1+3)/1).)
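Spelled out, the arithmetic behind that growth factor (a one-line check, using only the numbers above):

$$g = \sqrt[7]{\frac{1+3}{1}} = 4^{1/7} \approx 1.219,$$

i.e. about +22% per day.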


"only a minority of people use a bicycle for anything other than recreation"

I guess my upbringing and surroundings are special (a densely populated area in northern Europe), but I know plenty of people who get around in no other way (shopping, vacation, commute, everything). Before the gasoline motor scooter became widespread, and poisoned the air in Asia, people used bikes all the time.

Please elaborate on why you think the bicycle "merely offer[s] convenience or entertainment". I understand that people did not understand its potential and thought of it as a toy for crazy people, but wasn't the same true for the gasoline automobile? To me the bicycle is of great importance, not just for leisure and sport. I understand that the bicycle's value depends on the distance traveled, the flatness, the wind, and the road quality, but compared to a horse (which most people did not have) it is so much better.

3jasoncrawford
Good point. It did evolve into more than just a convenience for many people. In the beginning, though, it was seen as a leisure activity with no real practical value. And even today its economic and social impact is not as great as, say, textile mechanization. Almost everyone on Earth wears mass-manufactured clothes; only a minority of people use a bicycle for anything other than recreation.

The entire SIMD vector approach is good for many dot products, but it is not the same as a systolic array for rank-two-by-rank-two multiplication.

If the job were to multiply two 1024x1024 matrices, then a systolic array of 256x256 MACs would be a good choice. It would work four times on 256x1024-by-1024x256 matrices for 1024+256 steps.


To me there is a difference between the hardware for 1xN-by-Nx1 and MxN-by-NxM (with N > M > 1). Although any matrix operation is many 1xN-by-Nx1 dot products, doing them independently would be inefficient.
"If you do a matrix multiplication the obvious way, this results in dot products of rows and columns (one for each element of the resulting matrix). So it seems to me that improving matrix to matrix multiplication performance comes from improving the performance of dot products."
True, but not of individual dot products; rather of the collective of very many dot products. Obviously you do not do it the obvious way, as you would have to load the same data over and over again.

An example of a systolic algorithm might be designed for matrix multiplication. One matrix is fed in a row at a time from the top of the array and is passed down the array, the other matrix is fed in a column at a time from the left hand side of the array and passes from left to right. Dummy values are then passed in until each processor has seen one whole row and one whole column. At this point, the result of the multiplication is stored in the array and can now be output a row or a column at a time, flowing down or across the array.
https://en.wikipedia.org/wiki/Systolic_array
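A sketch of that data-reuse point in NumPy; the tile size and shapes are only for illustration, and a real systolic array streams operands through MAC cells instead of looping, but the blocking structure is the same:

```python
import numpy as np

def blocked_matmul(A, B, tile=256):
    """Multiply A (MxK) by B (KxN) block by block, so each loaded pair
    of blocks is reused for a whole tile-by-tile output block instead
    of reloading rows and columns for every single dot product."""
    M, K = A.shape
    K2, N = B.shape
    assert K == K2 and M % tile == N % tile == K % tile == 0
    C = np.zeros((M, N))
    for i in range(0, M, tile):
        for j in range(0, N, tile):
            for k in range(0, K, tile):
                # On systolic hardware this small matmul is what the
                # MAC grid computes as the operands flow through it.
                C[i:i+tile, j:j+tile] += (A[i:i+tile, k:k+tile]
                                          @ B[k:k+tile, j:j+tile])
    return C

A = np.random.rand(1024, 1024)
B = np.random.rand(1024, 1024)
assert np.allclose(blocked_matmul(A, B), A @ B)
```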

True, I had not claimed that all criteria could be or have been met. Because of the noise and the heat, I just the other day replaced the inductive load in some of my very old but still fully functioning kitchen counter lights with modern switching current regulators. The 50 Hz mains produces a 100 Hz tone that had been bothering me for decades. But even some of those can be heard by some people. (Not me; I am deaf to anything above 10 kHz.)

It is a compromise in an area of sensory overlap, but the human senses are not equally sensitive to all frequencies; your hearing is far better at 3 kHz. At your age you will still remember CRT monitors that operated at 60 Hz at maximum resolution. Bad, but they did get used.

Why 50/60 Hz? It has to be too low to be heard, too high to be seen, high enough for transformation, low enough for low induction losses, and low enough for simple rotating machines. Trains cannot use 50/60 Hz, so they went with a third of it (16 2/3 Hz or 20 Hz).
Grid frequency is controlled to within ±150 mHz; if that fails, private customers might get disconnected/dropped.
The time derivative of the grid frequency is a measure of the relative power mismatch.
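For that last point, the standard textbook relation (the per-unit swing equation; $H$ is the aggregate inertia constant, $f_0$ the nominal frequency, $S_\text{base}$ the rated power; the symbols are the usual power-engineering convention, not anything from this thread):

$$\frac{df}{dt} \approx \frac{f_0}{2H}\cdot\frac{P_\text{gen} - P_\text{load}}{S_\text{base}}$$

A sustained generation deficit therefore shows up directly as a falling frequency.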

1Robert Miles
50-60 Hz is not too low to be heard: https://www.youtube.com/watch?v=bslHKEh7oZk It's not really too high to be seen either; lights that flicker at mains frequency can be pretty unpleasant on the eyes, and give some people headaches.

I observe a radicalization that is driven by what I call the concept of "counter-crazy". It led to Trump, but I have been aware of it for longer on the left. The idea is that by being more radical in the direction you think the world needs to go, you could get it there. It is compounded by the tribalism and identity culture.

The idea of "you can not speak on this because you are not a woman / ..." is recent to me. But has been expressed by the most intelligent female I know. It is a scary idea.

The idea of cushioning life is a g... (read more)

I found this recent Dilbert cartoon to be a good summary of the issue with being smart in a complex, random world:

https://dilbert.com/strip/2020-01-25

The way you commented, it is not clear what you are referring to. I did not understand your comment because I did not get "where you were coming from".

1jmh
First, was Spock rational, or did he just want to think himself rational? I am not completely sure that was the underlying character trait of Vulcans in the show -- though I also agree that much can support it. It seems their history was one of excessive passion, apparently at an uncontrollable and very destructive level. Their solution seems to have been to suppress their emotions, and so the passion, which then left the purely intellectual response to the external world and to their own thinking/decisions. Since I don't see emotion and rationality as either opposites or necessarily antagonistic to one another, I wonder if considering rationality through a third lens -- epistemic, instrumental, and emotional -- might help lead to better decision-making than placing them in opposition. Principle #4 gets at this, with the diagrams showing them as opposing but the argument questioning that approach. (I actually missed this bit in my first comment.)

Straw_Vulcan is an example of an attack by two of the three types of thinkers on another.

The moral thinkers try to show their superiority. In Star Trek this is ever-present. In all the stories, morality and principles always win over rational compromise. The captains usually favor the best possible short-term outcome over risk minimization and the long term. As it is fiction, this always works out.

The three thinking types as formalized/categorized (to my knowledge) by Venkatesh Rao of ribbonfarm.

https://fs.blog/venkate... (read more)

Thank you, I should have thought of it in that (time complexity) context. Time complexity is not just about how long something takes but also about the nature of the problem. Chess (generalized to n-by-n boards) is in neither P nor NP, but the question of complexity is certainly related.

Maybe my question is: why can there be a heuristic that does fairly well and is nowhere near exponential? Even a count of the pieces left on the board usually says something that only a full search could prove.

Then you are wrong, because the search usually does not reach the checkmate state, so there is always a scoring heuristic replacing further exploration at some depth.

I knew of and had read chessprogramming prior to your post; you are wrong to assume that I am a total idiot just because I got myself confused.

2Dagon
Didn't mean to condescend, I was mostly pointing out that complexity is in the iteration of simple rules with a fairly wide branching. I will still argue that all the heuristics and evaluation mechanisms used by standard engines are effectively search predictions, useful only because the full search is infeasible, and because the full search results have not been memoized (in the not-so-giant-by-today's-standards lookup table of position->value).

Ok, let's go with chess. For that game there is an optimal balance between the tree search and the evaluation function. The search is exploratory. The evaluation is a score.

The evaluation can obviously predict the search to some degree. Humans are very bad at searching, yet some can still win against computers.

The search decompresses the information into something more easily evaluated by a computer. A human can do it with much less expansion. Is that just a matter of hardware, or is it because the information was there all along and just needed a "smarter" analysis?
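A minimal sketch of the search/evaluation split being discussed; the game interface (`moves`, `apply`, `material`, `is_terminal`, `white_to_move`) is hypothetical, just enough to show where the scoring heuristic cuts off the search:

```python
def minimax(state, depth):
    """Depth-limited adversarial search. When the depth budget runs out
    (or the game ends), a cheap heuristic such as a material count
    stands in for the rest of the tree."""
    if depth == 0 or state.is_terminal():
        return state.material()    # heuristic replaces deeper search
    values = (minimax(state.apply(m), depth - 1) for m in state.moves())
    return max(values) if state.white_to_move else min(values)
```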

2Dagon
I may not have been clear enough. The evaluation _IS_ a search. The value of a position is exactly the value of a min-max adversarial search to a leaf (game end). Compression and caching and prediction are ways to work around the fact that we don't actually have the lookup table available.

This reminds me of another issue. If you do make informed, complicated decisions, the basis of those decisions might change over time. I struggle with that problem professionally. As an engineer I have to make complicated compromises/decisions. The trouble is that the situation changes all the time; the requirements and the means change. Without tracking why I made a decision, there is no way to tell whether it still holds, because I do not even remember myself. The project becomes a zombie even before there are true legacy and hand-over issues. Usuall... (read more)

In reality there are smart penguins, dumb penguins, and penguin newspapers. The professional penguins will tell the other penguins how great it has been going, so that they can get out before the ledge breaks off and they all fall into the water.

To realize those booked earnings you have to sell without causing the crash, so you have to set up potential buyers first. That is why I consider articles in major papers about investing in something to be the last warning before the crash. When I read that the only smart thing to do is to invest in ..., I know not to.

You seem to think that the economy and markets are random, without memory or state. You are the one committing the fallacy called "the map is not the territory".

I think Liron only meant the times of growth with those 10%. Looking at the recent stock market, you will clearly find growth that is much higher than the long-term rate and higher than economy + inflation + "risk-free return". In the last 10 years the annual rate was indeed 10.5% p.a.
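As a quick sanity check on that number (pure arithmetic, nothing assumed beyond the quoted rate):

$$1.105^{10} \approx 2.71$$

so 10.5% p.a. over those 10 years means the market roughly 2.7x'd overall.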

There are two issues with it.

You cannot figure out how something works by looking only at some aspect. Think of the story of the blind men and the elephant.

But it still has a point, because with a subsystem that makes predictions, understanding the system by pure observation becomes impossible.

Right, one could expand the clause indefinitely; that is kind of what I meant by "can only find what you are looking for". But that only means it is hard, not that it is bad to think that way.

I think of it neither as logic, nor as causal diagrams, nor as Bayesian or Markov diagrams, but simply as sets of some member type, whose members may have any number of features/properties/attributes that make them members of some subset.

When I wrote "A AND B" I wanted you to understand it as a dual logic clause, but only for simplicity.

The way I really think... (read more)

2johnswentworth
I think of "gears-level model" and "causal DAG" as usually synonymous. There are some arguable exceptions - e.g. some non-DAG markov models are arguably gears-level - but DAGs are the typical use case. The obvious objection to this idea is "what about feedback loops?", and the answer is "it's still a causal DAG when you expand over time" - and that's exactly what gears-level understanding of a feedback loop requires. Same with undirected markov models: they typically arise from DAG models with some of the nodes unobserved; a gears-level model hypothesizes what those hidden factors are. The hospital example includes both of these: a feedback loop, with some nodes unobserved. But if you expand out the actual gears-level model, distinguishing between different people with different diseases at different times, then it all looks DAG-shaped; the observed data just doesn't include most of those nodes. This generalizes: the physical world is always DAG-shaped, on a fundamental level. Everything else is an abstraction on top of that, and it can always be grounded in DAGs if needed. The advantage of using causal DAGs for our model, even when most of the nodes are not observed, is that it tells us which things need to be included in the AND-clauses and which do not. For instance, "gear AND oval-shaped" vs "gear AND needs oil" - the idea that the second can be ignored "because that already follows from gears" is a fact which derives from DAG structure. For a large model, there's an exponential number of logical clauses which we could form; a DAG gives formal rules for which clauses are relevant to our analysis.

" Does anyone know of any similar experiments that have been run? "

I can say with certainty that it has been done on a small scale, because I once saw a German TV documentary for which they had assembled a small group (6-10) of both males and females, had them take an IQ test, and then showed them the pictures of the others. Later they also gave them some facts, showed videos, and asked them again.

The outcome was that there was a clear correlation between the guesses and the IQ test results, and it got better the more information people had.

But the p... (read more)

I made up this story:
In a company there had been head injuries, so they brought in a medical student to investigate.
The researcher gathered all employees' blood pressure, gender, age, and eyesight data.
The result was that mostly men were affected, with all other factors being what you would expect given the employees.
The company was forced by the insurance company to make helmets mandatory for all men, because their gender was a risk factor.
Because the engineers were all men, they were disproportionately affected and did not like to wear the helm... (read more)

"This will result in unnecessary stress and misery in your life."

LOL, that is very close to what I told a girl once. You would think it is the most sensitive and reasonable thing to tell a person, and a good way to put it. She did not call me names, but she was not thankful either.

"cases of autism that are caused entirely or mostly by normal genetics are associated with unusually low IQ (80% confidence) "

Only the research correlating genes and IQ test results is objective.

All correlations between IQ and DIAGNOSED autism are skewed. People who are smart and have good enough speech skills, and thus are not too affected, can hide their level of autism. People who are functional will not be diagnosed.

Let's assume that autism is not an on/off matter but gradual, and that there is a positive correlation with general intelligence; then the statistics will not include people below a high level of autism, because they compensate.

"autistic people ... generally have very low intelligence. One study ... autistic people had an IQ ..."

Unless you positively define intelligence as that which is measured by some IQ test, I oppose that statement.

The entire discussion around intelligence would profit if people stopped casually equating the two.

One is a test, of which I have seen different versions (some outright bad, others flawed); the other is a concept that can be described, but is much more often used than understood by the public.
