
arrrrrmin/lanz-mining


LanzMining

Get data

Since scraping is increasingly frowned upon and prohibited by most terms of use, I recommend a less automated process. In addition, the new ZDF web streaming application requires familiarity with Selenium or other WebDriver tooling.

I'd recommend using Obsidian Web Clipper together with an AI provider's API endpoint. You can also host your own open-source model and use your own endpoints.

You can use my configs and customize them for your own vault or application.

Install

Process data

To obtain your dataset, run the lanzmining processors like so:

python src/main.py -c config/vault.json -o output.csv

If you want to change the parsing process later, you can save a snapshot of the raw data for later use:

python src/main.py -c config/vault.json -o output.csv -snapshot-file snapshots/snapshot.csv

Back up your vault before applying the Obsidian Web Clipper templates. To merge a snapshot with a new vault created from updated templates:

python src/main.py -c config/vault.json -o output.csv -merge-file snapshots/snapshot.csv
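If you run these commands often, a small wrapper can assemble the argument list for you. The following is a minimal sketch, not part of lanzmining itself: `build_command` is a hypothetical helper, and it only assumes the flags shown above (`-c`, `-o`, `-snapshot-file`, `-merge-file`) as they are written in this README.

```python
import shlex
from typing import List, Optional


def build_command(
    config: str,
    output: str,
    snapshot_file: Optional[str] = None,
    merge_file: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for src/main.py.

    Hypothetical helper: it only mirrors the flags documented in this
    README and does not inspect the real CLI.
    """
    cmd = ["python", "src/main.py", "-c", config, "-o", output]
    if snapshot_file is not None:
        # Save a snapshot of the raw data for later use.
        cmd += ["-snapshot-file", snapshot_file]
    if merge_file is not None:
        # Merge an existing snapshot with the current vault.
        cmd += ["-merge-file", merge_file]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be checked first.
    print(shlex.join(build_command(
        "config/vault.json",
        "output.csv",
        snapshot_file="snapshots/snapshot.csv",
    )))
```

To actually execute the command, the returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.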

Visualizations

All visualizations for the talk are built with D3.js, which I decided to wrap in a Svelte project. If you're not familiar with Svelte, all visualization code is written in plain JavaScript. You can find it at visuals/src/lib/visualisations.

About

A data project to explore media participation in German talk shows of the public broadcasting media.
