papetoast

Year 3 Computer Science student

find me anywhere in linktr.ee/papetoast

Comments

You can still nominate posts until Dec 14th?

Thought about community summaries a little bit too. With the current LW UI, I envision that the most likely way to achieve this is to:

  1. Write a distillation comment instead of a post
  2. Quote the first sentence of the sequence post so that it shows up on the side at the top
  3. Wait for the LW team to make this setting persistent so people can choose Show All

There is also the issue of things only being partially orderable.

When I was recently celebrating something, I was asked to share my favorite memory. I realized I didn't have one. Then (since I have been studying Naive Set Theory a LOT), I got tetris-effected and as soon as I heard the words "I don't have a favorite" come out of my mouth, I realized that favorite memories (and in fact favorite lots of other things) are partially ordered sets. Some elements are strictly better than others but not all elements are comparable (in other words, the set of all memories ordered by favorite does not have a single maximal element). This gives me a nice framing to think about favorites in the future and shows that I'm generalizing what I'm learning by studying math which is also nice!

- Jacob G-W in his shortform
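
A tiny illustration of that point (my own made-up example, not from the quoted shortform; the memories, dimensions, and the dominates helper are all hypothetical): if memories are scored on more than one dimension and compared by dominance, several can be maximal while none is a single maximum.

```python
# Made-up illustration: "memories" scored on two incomparable dimensions,
# ordered by dominance (at least as good on every dimension).
memories = {
    "graduation": {"joy": 9, "calm": 2},
    "hiking trip": {"joy": 6, "calm": 8},
    "quiet morning": {"joy": 5, "calm": 7},
}

def dominates(a, b):
    """a >= b in the partial order iff a is at least as good on every dimension."""
    return all(a[k] >= b[k] for k in a)

# A memory is maximal if no *other* memory dominates it.
maximal = [
    name for name, score in memories.items()
    if not any(dominates(other_score, score)
               for other_name, other_score in memories.items()
               if other_name != name)
]
print(maximal)  # ['graduation', 'hiking trip'] -- two maximal elements, no maximum
```

"graduation" and "hiking trip" are incomparable, so "which is your favorite?" has no unique answer, which is exactly the situation described in the quote.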

It is hard to see; changed to n.

In my life I have never seen a good one-paragraph explanation of backpropagation, so I wrote one.

The most natural algorithms for calculating derivatives work by walking the expression syntax tree[1]. The tree has two ends, and starting the walk from either end gives one of two good derivative algorithms: forward propagation (starting from the input variables) and backward propagation (starting from the output variables). In both algorithms, calculating the derivative ∂y/∂x of one output variable y with respect to one input variable x creates a lot of intermediate artifacts. In forward propagation, these artifacts mean you get ∂y_j/∂x for every output y_j for ~free; in backward propagation, you get ∂y/∂x_i for every input x_i for ~free. Backpropagation is used in machine learning because there is usually only one output variable (the loss, a number representing the difference between the model's prediction and reality) but a lot of input variables (the parameters, on the scale of millions to billions).

This blogpost has the clearest explanation. Credits for the image too.

https://colah.github.io/posts/2015-08-Backprop/
  1. or maybe a directed acyclic graph for multivariable vector-valued functions like f(x,y) = (2x+y, y-x)
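
Here is a toy sketch of that distinction (my own, not from the linked post; the function names forward_mode/reverse_mode and the intermediate variables are made up), using the footnote's example f(x, y) = (2x + y, y - x). One forward pass, seeded with a direction on the inputs, gives both outputs' derivatives in that direction (one Jacobian column); one backward pass, seeded on an output, gives that output's derivatives with respect to both inputs (one Jacobian row).

```python
# Toy sketch: forward vs. reverse mode on f(x, y) = (2x + y, y - x).
# Intermediates: a = 2*x, u = a + y, v = y - x; outputs are (u, v).

def forward_mode(x, y, dx, dy):
    """One forward sweep with input seed (dx, dy):
    returns d(u, v) in that direction, i.e. one column of the Jacobian."""
    a, da = 2 * x, 2 * dx        # a = 2x
    u, du = a + y, da + dy       # u = a + y
    v, dv = y - x, dy - dx       # v = y - x
    return (u, v), (du, dv)      # both outputs' derivatives w.r.t. one input

def reverse_mode(x, y, bu, bv):
    """One backward sweep with output seed (bu, bv):
    returns d(seeded output)/d(x, y), i.e. one row of the Jacobian."""
    a = 2 * x                    # forward sweep, storing intermediates
    u, v = a + y, y - x
    ba = bu                      # adjoint of a, from u = a + y
    by = bu + bv                 # y feeds both u and v
    bx = 2 * ba - bv             # a = 2x contributes 2*ba; v = y - x contributes -bv
    return (u, v), (bx, by)      # one output's derivatives w.r.t. both inputs

# The Jacobian of f at any point is [[2, 1], [-1, 1]].
_, col_x = forward_mode(3.0, 5.0, 1.0, 0.0)   # column for x: (2.0, -1.0)
_, col_y = forward_mode(3.0, 5.0, 0.0, 1.0)   # column for y: (1.0, 1.0)
_, row_u = reverse_mode(3.0, 5.0, 1.0, 0.0)   # row for u: (2.0, 1.0)
_, row_v = reverse_mode(3.0, 5.0, 0.0, 1.0)   # row for v: (-1.0, 1.0)
print(col_x, col_y, row_u, row_v)
```

With one output (the loss) and millions of inputs (the parameters), a single reverse sweep already yields the whole gradient, whereas forward mode would need one sweep per parameter; that asymmetry is why machine learning uses backpropagation.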

Strongly agreed. Content creators seem to get around this by creating multiple accounts for different purposes, but this is difficult to maintain for most people.

I rarely see them show awareness of the possibility that selection bias has created the effect they're describing.

In my experience with people I encounter, this is not true ;)

Joe Rogero: Buying something more valuable with something less valuable should never feel like a terrible deal. If it does, something is wrong.

clone of saturn: It's completely normal to feel terrible about being forced to choose only one of two things you value very highly.

https://www.lesswrong.com/posts/dRTj2q4n8nmv46Xok/cost-not-sacrifice?commentId=zQPw7tnLzDysRcdQv
