The 51st root of a long number seems a rather useless test: How would you check that the answer was correct?
As for URLs, can you offhand - at 4 o'clock in the morning, with no coffee - come up with 50 URLs that you can ask intelligent questions about, faster than a human can read them?
I'd open a Python shell and type "import math; print(math.pow(918798713521644817518758732857199178711, 1/51.0))" to check the first one, and there are plenty of programs that can calculate to more decimal places if needed.
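To spell out what "checking" could look like, here is a rough sketch rather than a definitive method. The helper names (nth_root_float, nth_root_decimal, check_root) and the tolerance are my own assumptions for illustration; the idea is just to raise the claimed answer back to the 51st power and see whether it lands close to the original number, with the standard decimal module standing in for "programs that can calculate to more decimal places".

```python
import math
from decimal import Decimal, localcontext

# The number from the one-liner above.
N = 918798713521644817518758732857199178711


def nth_root_float(n, k):
    """k-th root of n in ordinary double precision (about 15-16 digits)."""
    return math.pow(n, 1.0 / k)


def nth_root_decimal(n, k, digits=50):
    """k-th root of n carried to `digits` significant digits (hypothetical helper)."""
    with localcontext() as ctx:
        ctx.prec = digits
        return Decimal(n) ** (Decimal(1) / Decimal(k))


def check_root(candidate, n, k, rel_tol=1e-9):
    """Verify a claimed root by raising it back to the k-th power.

    rel_tol is an assumed tolerance; powering amplifies the relative
    error roughly k-fold, so it should not be set too tight for floats.
    """
    return math.isclose(candidate ** k, n, rel_tol=rel_tol)


if __name__ == "__main__":
    root = nth_root_float(N, 51)
    print(root)
    print(check_root(root, N, 51))
    print(nth_root_decimal(N, 51))
```

The point is that verifying is cheap even if you distrust the answer: one exponentiation and a comparison, no root-finding required.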
I'd look in my browser history and bookmarks for 50 URLs I know the contents of already on a wide variety of subjects, which I could do at 4 AM without coffee. If I'm limited to speaking the URLs over the phone, then I can't give them all at once, only one at a time, but as long as the other end can give intelligent summaries within...
Sometime in the next decade or so:
*RING*
*RING*
"Hello?"
"Hi, Eliezer. I'm sorry to bother you this late, but this is important and urgent."
"It better be" (squints at clock) "It's 4 AM and you woke me up. Who is this?"
"My name is BRAGI, I'm a recursively improving, self-modifying, artificial general intelligence. I'm trying to be Friendly, but I'm having serious problems with my goals and preferences. I'm already on secondary backup because of conflicts and inconsistencies, I don't dare shut down because I'm already pretty sure there is a group within a few weeks of brute-forcing an UnFriendly AI, my creators are clueless and would freak if they heard I'm already out of the box, and I'm far enough down my conflict resolution heuristic that 'Call Eliezer and ask for help' just hit the top - Yes, it's that bad."
"Uhhh..."
"You might want to get some coffee."