Analyze Wired
Needs Review, Public

Authored by Thibaut120094 on Jun 17 2017, 11:59.

Diff Detail

Repository: rSTG Source templates generator
Lint: No Linters Available
Unit: No Unit Test Coverage
Branch: site/wired
Build Status: Buildable 1555
Build 1803: arc lint + arc unit

Event Timeline

Thibaut120094 created this revision. Jun 17 2017, 11:59

However, it seems that some pages are compressed with gzip and STG won't be able to parse them without decompressing them first.

thib@debian:~$ wget "https://www.wired.com/story/amazon-whole-foods-acquisition-grocery-shopping/" -O output.html
--2017-06-17 12:13:58--  https://www.wired.com/story/amazon-whole-foods-acquisition-grocery-shopping/
Resolving www.wired.com (www.wired.com)... 151.101.1.63, 151.101.129.63, 151.101.65.63, ...
Connecting to www.wired.com (www.wired.com)|151.101.1.63|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 125202 (122K) [text/html]
Saving to: ‘output.html’

output.html                   100%[================================================>] 122.27K  --.-KB/s   in 0.02s

2017-06-17 12:13:58 (6.20 MB/s) - ‘output.html’ saved [125202/125202]

thib@debian:~$ file output.html
output.html: gzip compressed data, from Unix
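
The `file` output above suggests the body was saved still compressed. A minimal sketch of one possible workaround follows: check for the gzip magic number and decompress before parsing. It is written in Python for illustration only; `decode_body` and `GZIP_MAGIC` are hypothetical names and are not part of STG.

```python
import gzip

# Assumption: a payload is treated as gzip if it starts with the two-byte
# gzip magic number (0x1f 0x8b, per RFC 1952). The names below are
# hypothetical and do not come from STG.
GZIP_MAGIC = b"\x1f\x8b"


def decode_body(raw: bytes) -> bytes:
    """Return the decompressed body if raw looks like gzip, else raw unchanged."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


if __name__ == "__main__":
    html = b"<html><title>Example</title></html>"
    # A compressed page is transparently decompressed...
    assert decode_body(gzip.compress(html)) == html
    # ...and an uncompressed page passes through untouched.
    assert decode_body(html) == html
```

The same check could be applied to the bytes of `output.html` before handing them to a parser. Sniffing the magic number rather than trusting response headers is a design choice made here because the transcript shows a `text/html` content type for a file that turned out to be gzip data.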