and that that will come with the need for content moderation
It will certainly come with calls for content moderation, but for all the reasons you allude to, the assertion that there will be a need for such moderation seems quite tendentious.
I agree with @faul_sname that the bash is more readable.
But maybe a better (more readable/maintainable) Python alternative is to explicitly use Amazon's Python API for S3 downloads? I've never used it myself, but googling suggests:
import gzip
from io import BytesIO

import boto3

try:
    s3 = boto3.resource('s3')
    key = 'YOUR_FILE_NAME.gz'
    obj = s3.Object('YOUR_BUCKET_NAME', key)
    # Read the whole compressed object into memory
    n = obj.get()['Body'].read()
    # Wrap the bytes in a file-like object and decompress
    gzipfile = BytesIO(n)
    gzipfile = gzip.GzipFile(fileobj=gzipfile)
    content = gzipfile.read()
    print(content)
except Exception as e:
    print(e)
    raise
You could wrap that in a function to parallelize the download/decompression of path1 and path2 (using your favorite Python parallelization paradigm). But this wouldn't handle piping the decompressed files to cmd without using temp files...
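For what it's worth, here is a rough sketch of what "wrap that in a function and parallelize" might look like with a thread pool. The names fetch_s3_gzip and fetch_many are made up for illustration, and I haven't run the S3 part against a real bucket. It creates a fresh boto3 session per call, on the assumption that sharing one resource object across threads is not safe. Like the snippet above, it holds everything in memory and does nothing about piping to cmd.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor


def fetch_s3_gzip(bucket, key):
    """Download one gzipped S3 object and return its decompressed bytes.

    Hypothetical helper: assumes boto3 is installed and credentials are
    configured in the environment.
    """
    import boto3  # imported lazily so the rest of the sketch runs without boto3

    # One session per call, so no boto3 resource is shared between threads.
    s3 = boto3.session.Session().resource("s3")
    body = s3.Object(bucket, key).get()["Body"].read()
    return gzip.decompress(body)


def fetch_many(bucket, keys, fetch=fetch_s3_gzip, max_workers=4):
    """Fetch and decompress several keys concurrently.

    Results come back in the same order as `keys`. The `fetch` parameter
    exists only so the download step can be swapped out (e.g. for testing).
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda key: fetch(bucket, key), keys))
```

Usage would be something like content1, content2 = fetch_many('YOUR_BUCKET_NAME', [path1, path2]). Threads seem like a reasonable fit here since the work is mostly waiting on the network, but any other parallelization approach would do.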
I'd argue that using argh is just as easy and strictly better:
$ cat test.py
#!/usr/local/bin/python
import argh

def start(width, depth, height):
    print(float(width) * float(depth) * float(height))

if __name__ == '__main__':
    p = argh.ArghParser()
    p.set_default_command(start)
    p.dispatch()
$ ./test.py -h
usage: test.py [-h] width depth height

positional arguments:
  width       -
  depth       -
  height      -

options:
  -h, --help  show this help message and exit
$ ./test.py 1 2 3
6.0
$ ./test.py 1 2 3 4
usage: test.py [-h] width depth height
test.py: error: unrecognized arguments: 4
$ ./test.py 1 2
usage: test.py [-h] width depth height
test.py: error: the following arguments are required: height
And it's even easier if you are willing to use commands (which is often useful when you want to extend the script to do more than one thing):
$ cat test.py
#!/usr/local/bin/python
import argh

def volume(width, depth, height):
    print(float(width) * float(depth) * float(height))

def area(width, height):
    print(float(width) * float(height))

if __name__ == '__main__':
    argh.dispatch_commands([volume, area])
$ ./test.py -h
usage: test.py [-h] {volume,area} ...

positional arguments:
  {volume,area}
    volume
    area

options:
  -h, --help     show this help message and exit
$ ./test.py volume -h
usage: test.py volume [-h] width depth height

positional arguments:
  width       -
  depth       -
  height      -

options:
  -h, --help  show this help message and exit
$ ./test.py volume 1 2 3
6.0
$ ./test.py area 12 24
288.0
AFAICT those fines have not been for missing cookie banners. And if I were Mark Zuckerberg, I might think to myself, "the EU is going to shake us down for 'privacy violations' no matter what we do, so why should I bother making our user experience worse with annoying cookie banners?"
(Also, to some extent, FAANG-scale companies may get fined but serve as a shield for all smaller companies. If you were a Brussels bureaucrat with a focus on fining websites for privacy issues, and you could get hundreds of millions for targeting a FAANG [not that you get to keep any of that money or plausibly tell yourself that your work improved the world in any meaningful way, but hey whatever floats your boat], would you bother fining Joe Startup $100k for imperfect privacy practices in the app they're running from their garage in San Bruno?)
OK, maybe I'm wrong about the politics as regards large multinationals. (Although I'm not sure I'm wrong.)
But that argument says nothing about why a website like JSTOR (non-profit, US-based) complies. I'm skeptical that anyone would try to enforce against them, and also that any such enforcement would have actual legal consequences. EU tries to fine JSTOR, JSTOR says "we are in the US" and doesn't pay, then...? Does anyone actually think the EU is going to force all European ISPs to block JSTOR? I suppose if JSTOR uses EU-based datacenters to serve some content to European users, those could be shut down. I do not think that would be a popular move with European academics.
Why do non-EU-based companies/websites bother to comply with this directive? For that matter, why do even big firms with an EU presence comply? I can see why a firm with an EU office or employees might worry about some legal risk, but (a) is it really true that the EU would devote significant enforcement resources to prosecuting/fining "victimless" violations of this directive? (b) for sufficiently popular websites (Amazon, FB, ...) surely the companies have more leverage than the EU, since I'd think that one of these firms even threatening to stop serving European customers (or employing EU programmers for that matter) would cause vastly more political backlash compared to the amount of genuine political support for the (policy motivation behind the) annoying cookie banners.
Note that in the inexact case (i.e. observation error) this model (the Lasso) fits comfortably in a Bayesian framework. (Double exponential prior on u.) Leon already made this point below and jsteinhardt replied
Do you have any reason to believe these figures are accurate?