WSJ Data Transparency Code-a-Thon
For the Scanning track
Problem it was solving:
- Checks a website before you visit it. Most privacy tools only analyze cookies and scripts after they have been set, by which point you're already being tracked. This lets you scope out the site beforehand by using my server as a proxy.
- Gives you a list of external files
- Informs you about any cookies it's trying to set
- Tells you about any sensitive material on the site - hate speech, adult content, politically sensitive content
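A rough sketch of how the external file list and cookie report above could be built once the proxy has fetched a page. This is not the actual code behind the tool: the names `ResourceCollector`, `split_resources` and `cookie_names` are made up for illustration, it only looks at a handful of tags, and it assumes "external" means any host that is not exactly the page's own host (so subdomains count as external).

```python
from html.parser import HTMLParser
from http.cookies import SimpleCookie
from urllib.parse import urljoin, urlparse


class ResourceCollector(HTMLParser):
    """Collects the URLs that a page's resource tags point at."""

    # Hypothetical, deliberately short list of tag -> attribute pairs.
    RESOURCE_ATTRS = {"script": "src", "img": "src", "iframe": "src", "link": "href"}

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        wanted = self.RESOURCE_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative and protocol-relative URLs against the page.
                self.urls.append(urljoin(self.page_url, value))


def split_resources(page_url, html):
    """Return (internal, external) resource URLs for already-fetched HTML."""
    collector = ResourceCollector(page_url)
    collector.feed(html)
    page_host = urlparse(page_url).hostname
    internal, external = [], []
    for url in collector.urls:
        # Assumption: an exact hostname match is what counts as internal.
        if urlparse(url).hostname == page_host:
            internal.append(url)
        else:
            external.append(url)
    return internal, external


def cookie_names(set_cookie_headers):
    """Return the names of cookies a response is trying to set."""
    names = []
    for header in set_cookie_headers:
        cookie = SimpleCookie()
        cookie.load(header)
        names.extend(cookie.keys())
    return names
```

The proxy would pass in the HTML body and the `Set-Cookie` header values from its own request, so nothing is ever set in the visitor's browser.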
Future features that I will implement:
- An opt-in scoreboard of URLs that it has crawled, sortable by how dodgy the site seems, i.e. how many weird external links it contains.
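Since the scoreboard is not built yet, the following is only a guess at its shape. `CrawlResult`, its fields and `scoreboard` are hypothetical names, and "dodginess" is assumed to be nothing more than the count of external links.

```python
from dataclasses import dataclass


@dataclass
class CrawlResult:
    url: str
    external_links: int  # assumed stand-in for how dodgy the site seems
    opted_in: bool       # only opted-in crawls should ever be listed


def scoreboard(results):
    """Return opted-in results, most external links first."""
    visible = [r for r in results if r.opted_in]
    return sorted(visible, key=lambda r: r.external_links, reverse=True)
```

A real score would probably weigh which hosts the links point at rather than just counting them.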
What is significant?
- It shows a graphical breakdown of internal and external files / elements
- It's useful for anyone whose connection is being monitored and who wants to check that a site is Safe For Work before visiting it. Or just for generally paranoid people.
What is your sustainability model?
- I'll keep it running myself and, if it's popular, maybe ask for donations. Sites could also submit (for a fee) their data or more in-depth explanations of their external links to prove how safe they are.
What license is it available under?
- Completely free to use, code on request (also free, it's just not yet on GitHub :P)
URL to running code:
Email: email@example.com :)