VST History of Video Capture and Editing Timeline*

Start Year End Year DB Start DB End Video Capture Editing Long-form Description
2012 2013 DB6 DB7 None Twitch highlighting
During the first 2 years of the VST the video editing was done via the Twitch highlight system.

Discovered Issues:
|- Editing via Twitch highlight was extremely problematic due to the nature of the run (VERY long videos)
|- Twitch's highlight system (at the time) was VERY poor at precise seeking and cutting in long videos.
2014 2014 DB8 DB8 Wubloader (v1) YouTube
Video Capture:
Due to the overzealous VOD muting that Twitch added on 2014-08-06, dave_random wrote the first Wubloader to capture the video live during the event.
The first Wubloader used:
Livestreamer to capture the stream live to UTC named .mp4 files,
FFMPEG to edit the requested videos,
ImageMagick for thumbnails,
PHP to run everything,
YouTube API to upload videos, and
GoogleDocs API to interact with the private VST gDocs sheet (for when to cut and upload a requested video).
A video would be trimmed with 2 minutes on either side of the times in the spreadsheet and uploaded as unlisted to YouTube.
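
The padding rule above can be sketched in a few lines. This is only an illustration of the arithmetic, not the original PHP code; the function name, argument names, and example times are assumptions made up for this sketch.

```python
from datetime import datetime, timedelta

# From the text: 2 minutes of padding on either side of the sheet times.
PADDING = timedelta(minutes=2)


def rough_cut_window(sheet_start, sheet_end, padding=PADDING):
    """Return the (start, end) of the padded rough cut for one sheet row.

    `sheet_start` and `sheet_end` are hypothetical names for the UTC times
    entered in the spreadsheet for a requested video.
    """
    if sheet_end < sheet_start:
        raise ValueError("end time is before start time")
    return sheet_start - padding, sheet_end + padding


if __name__ == "__main__":
    # Arbitrary example times, not taken from a real sheet row.
    start, end = rough_cut_window(
        datetime(2014, 11, 14, 18, 0, 0),
        datetime(2014, 11, 14, 18, 5, 0),
    )
    print(start.isoformat(), end.isoformat())
```

The padded clip is deliberately too long; the person editing then trims it down to the real start and end.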

Video Editing:
Once the video was on YouTube, a VST person editing videos would go into the YouTube editor and trim the video down to the desired clip length.
After the video finished processing, the VST person editing the video would then set it to public on both YouTube and the VST private sheet.
This would then make the video available on the VST public sheet as well as the YouTube channel.

Once again, the VST found the edge case:
|- You cannot have "too many" (we never figured out the exact number) videos in the "processing edits" state on YouTube at once.
|-- Once you hit the limit of videos in "processing edits", you become completely locked out of the YouTube video editor.
|-- The lockout lasted for an undetermined length of time after the last video finished processing.

Discovered Issues:
|- There was a minor problem with the way the capture worked: if all the capture nodes missed a piece of the stream, there would be no way to edit it.
|-- This was usually circumvented by a couple of VST members also doing a local capture for themselves.
2015 2018 DB9 DB12 Wubloader (v2) Thrimbletrimmer (v1-3)
Video Capture:
Due to the new home-grown editing component (to avoid YouTube editing hell), dave_random had to create a new Wubloader.
The new Wubloader used:
Node.js for tying all the pieces together,
Livestreamer to capture the stream live to UTC named .mp4 files,
FFMPEG to edit the requested videos,
ImageMagick for thumbnails,
YouTube API to upload videos, and
GoogleDocs API to interact with the private VST gDocs sheet.
The Wubloader project was moved from PHP to Node.js, as we now also had to serve an editing front-end that was tied to gDocs auth.
It also had to accept input from the new Thrimbletrimmer video editor, so videos could be sent to YouTube pre-trimmed.
Initial video edits were still done with 2 minutes on either side and down-scaled to 480p for ease of editing (as video quality doesn't really matter to editors).

Due to Livestreamer going unsupported in 2016, the capture component had to be switched to Streamlink for the DB11 (2017) and DB12 (2018) runs.

Video Editing:
Thrimbletrimmer (v1-3) was a custom video editing component written and maintained by Master_Gunner.
It was a (relatively) simple video editing interface that allowed accurate trimming of the beginning and end of a video clip.
The editor also allowed fine tuning via milliseconds so videos could be accurately trimmed to start/end on a good audio spot.
Master_Gunner gradually updated the editor over the years to better work with the way the VST edits videos and to improve functionality.

Discovered Issues:
|- The minor problem with the way the capture was done remained, but it was rarely a major problem.
|-- It was only really an issue with hard stream drops, when the nodes did not instantly pick back up once the stream came back.
|- The only problem with the way editing was done was not with the editor itself, but with the actual cutting of the video.
|-- Video cutting had to be done on the same node the editing was done on, because there was no inter-node communication.
|-- This could lead to a node becoming overloaded with edits, and some edits then had to be redone because that node's editing service had to be restarted.
2019 20?? DB13 DB?? Wubloader (v3) Thrimbletrimmer (v4)
Video Capture:
During DB12 (2018-11-10) we discovered a new Twitch "feature": they had changed the way authentication worked with their HLS servers to force a 24-hour refresh.
This caused ALL of the captures the VST was making to miss a poster moment, so ekimekim decided to take on the task of making a new Wubloader.

The new Wubloader uses:
A completely self-written HLS capture bot (in Python 2.x) by ekimekim and Chrusher,
which uses:
gevent for process concurrency on nodes,
PostgreSQL for a backend database,
Flask for writing the interaction API,
nginx for serving HLS segments, stats, and the API,
Prometheus for monitoring the servers,
Grafana for dashboards,
FFMPEG to edit the requested videos,
Docker for deployment of nodes,
YouTube API to upload videos,
GoogleDocs API to interact with the private VST gDocs sheet.

The new Wubloader is different in that it was designed to be a distributed modular system for capturing the stream from Twitch and editing clips from it.
This means that we can have a number of capture nodes that can all get segments from each other in case of stream drops or network problems.
With the modular design, editing and cutting videos are now independent processes that can be done by any editing or cutting node, respectively.
Lines from the VST private sheet are also stored in a database (with UUID identifiers) so only a single node has to poll the spreadsheet vs all of them.
In addition, Thrimshim is an API component written to connect Thrimbletrimmer and the Wubloader.
As we are capturing the raw HLS stream segments in multiple resolutions, we can just re-stream those segments for editing without re-encoding anything.
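
The idea of capture nodes filling in each other's gaps can be sketched roughly as follows. This is a simplified illustration and not the Wubloader's actual backfiller code; the function name, the node names, and the segment file names are all assumptions made up for the example.

```python
def plan_backfill(local_segments, remote_listings):
    """Decide which segments to fetch from which node.

    `local_segments` is an iterable of segment names this node already has.
    `remote_listings` maps a (hypothetical) node name to the segment names
    that node reports having. Each missing segment is requested only once,
    from the first node that lists it.
    """
    have = set(local_segments)
    plan = {}
    for node, listing in remote_listings.items():
        wanted = [name for name in sorted(set(listing)) if name not in have]
        if wanted:
            plan[node] = wanted
            have.update(wanted)
    return plan


if __name__ == "__main__":
    local = ["seg-0001.ts", "seg-0002.ts"]
    remotes = {
        "node-a": ["seg-0001.ts", "seg-0003.ts"],
        "node-b": ["seg-0003.ts", "seg-0004.ts"],
    }
    for node, names in plan_backfill(local, remotes).items():
        print(node, names)
```

As long as at least one node captured a given segment, every other node can eventually obtain it, which is what removes the "all nodes missed it" failure mode of the earlier versions.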

Best Practices:
1) At least 2 nodes that keep the full capture.
2) The node that runs the database/monitoring/sheet-sync is on good physical hardware and internet.
3) Editing nodes can run on any setup, but cutting nodes should be on physical hardware.
4) Geographically different nodes to minimize possibility for segment loss / ad interruption.

DB13 Actual Server Setup:
(multiple) Wubloader capture nodes were set up that constantly poll the stream to ask for new segments.
|- These nodes were located in geographically distinct locations to correctly avoid ad segments.
|- Some nodes were partial capture nodes that kept only a limited time window of data and acted as backup for other nodes.
|-- These were nodes with limited disk space.
|-- These nodes were deployed as an "in case all other nodes miss a segment" fallback.
(On each capture node) a backfiller and restreamer were running to sync missed segments to and from all other nodes.
(On the "best" full capture node) The database, sheet-sync, and monitoring components were set up.
|- "best" here means: Best physical hardware and internet connection.
(On 2 "good" full capture nodes) Thrimbletrimmer (video editor), Thrimshim (API), and cutter were set up.
|- "good" here means they are on physical hardware (not Virtual Machines).

Video Editing:
Thrimbletrimmer (v4) is a custom video editing component developed by Master_Gunner with some modifications written by ekimekim.
Due to the move to storing HLS segments, the video editor had to be rewritten. It now uses:
Video.js, an HTML5 video player that can play an HLS playlist,
Custom Javascript interaction with Thrimshim (wubloader API) for getting segments and sending information back, and
Custom Javascript and HTML to make the player better.

It is a (relatively) simple video editing interface that allows accurate trimming of the beginning and end of a video clip.
The editor also allows fine tuning on a frame-by-frame basis so videos can be accurately trimmed to start/end on a good audio spot.

Discovered issues:
|- During the DB13 run Twitch disabled all 3rd party API keys for viewing streams, and we were forced to switch to using Twitch's own API key.
|- Very rarely one of the 2-second segments will not have any audio; this has been verified to be a Twitch problem.
|-- Detection methods for the no-audio segments are being investigated.

(For reference, to be deleted assuming everyone is OK with description changes)
The Wubloader is a distributed, modular system for capturing the Desert Bus stream from Twitch and for editing and uploading highlights of the stream. The Wubloader consists of a number of nodes that each download the stream and copy any parts of the stream they do not have from the other nodes. The Wubloader copies events from the Sheet into its database. An editor can then edit a video from any event in the database using the in-browser Thrimbletrimmer editor. Once an edit is saved to the database, one of the Wubloader nodes can cut the video and upload it.
A Twitch stream is made up of a playlist of short segments of video which are usually 2 seconds long. The downloader component constantly asks Twitch for the latest playlist, downloads these segments and saves them to disc. The restreamer component when queried provides a list of what segments are available locally and can generate playlists to stream the segments it has. The backfiller component regularly queries the restreamers on other nodes to request a list of what segments that node has. The backfiller then downloads any segments it does not have ensuring that all segments are available on all nodes. The segment_coverage component regularly checks for holes and duplicate segments among the segments on a node.
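
A check for holes and duplicates of the kind described above might look like the sketch below. It is not the real segment_coverage code: representing a segment as a `(start_seconds, duration_seconds)` pair, the function name, and the tolerance value are all assumptions chosen to keep the example small.

```python
def find_holes_and_duplicates(segments, tolerance=0.01):
    """Scan (start, duration) pairs for gaps and repeated start times.

    Returns (holes, duplicates): holes is a list of (gap_start, gap_end)
    ranges with no coverage, duplicates is a list of start times that
    appear more than once. `tolerance` (in seconds) is an arbitrary
    allowance for small timestamp jitter.
    """
    holes, duplicates = [], []
    prev_start = prev_end = None
    for start, duration in sorted(segments):
        end = start + duration
        if prev_start is None:
            prev_start, prev_end = start, end
            continue
        if start == prev_start:
            # Same start time seen again: record it once per extra copy.
            duplicates.append(start)
            prev_end = max(prev_end, end)
            continue
        if start - prev_end > tolerance:
            holes.append((prev_end, start))
        prev_start, prev_end = start, max(prev_end, end)
    return holes, duplicates


if __name__ == "__main__":
    segs = [(0.0, 2.0), (2.0, 2.0), (2.0, 2.0), (6.0, 2.0)]
    holes, dups = find_holes_and_duplicates(segs)
    print("holes:", holes)
    print("duplicates:", dups)
```

In this example the segment starting at 4.0 is missing and the one starting at 2.0 was stored twice, so both a hole and a duplicate are reported.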

The sheetsync component sits between the VST G-Docs sheet, updated by the VST during the run, and a Postgres database. It reads input from the sheet to the database and writes the results of edits and cuts from the database to the sheet. Videos are edited in the browser-based Thrimbletrimmer. Thrimbletrimmer also acts to play back streams generated by a restreamer and provides the option to download chunks of the stream as a single file. The thrimshim component handles the communication between Thrimbletrimmer and the database, passing the description and rough start and end times originally from the sheet to the editor and passing the start and end times of the video to be cut as well as its title to the database. The cutter checks the database for videos to be cut, and based on the information in the database, combines and cuts the relevant segments into a single file before uploading the video to a hosting location (usually YouTube).
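
The cutter's first step, picking the segments that cover a requested cut, can be sketched as follows. This is an illustration under assumptions and not the real cutter: segments are modelled as `(start_seconds, duration_seconds)` pairs and the function name is made up for the example.

```python
def segments_for_cut(segments, cut_start, cut_end):
    """Pick the segments overlapping [cut_start, cut_end).

    Returns (chosen, head_trim, tail_trim): the overlapping segments in
    time order, plus how many seconds would need trimming from the first
    and last chosen segment to hit the requested times exactly.
    """
    if cut_end <= cut_start:
        raise ValueError("cut_end must be after cut_start")
    chosen = [
        (start, duration)
        for start, duration in sorted(segments)
        if start < cut_end and start + duration > cut_start
    ]
    if not chosen:
        return [], 0.0, 0.0
    first_start = chosen[0][0]
    last_end = chosen[-1][0] + chosen[-1][1]
    head_trim = max(0.0, cut_start - first_start)
    tail_trim = max(0.0, last_end - cut_end)
    return chosen, head_trim, tail_trim


if __name__ == "__main__":
    segs = [(0.0, 2.0), (2.0, 2.0), (4.0, 2.0), (6.0, 2.0)]
    chosen, head, tail = segments_for_cut(segs, 3.0, 6.5)
    print(chosen, head, tail)
```

Only the first and last chosen segments need any trimming; everything in between can be joined as-is.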

The complete Wubloader system consists of two or more editing nodes as well as a number of replication nodes. Each replication node consists of a downloader, a restreamer, a backfiller and a segment_coverage component. An edit node contains the same components as well as a thrimshim and a cutter. One of the edit nodes also hosts the database and the sheetsync while a backup of the database is maintained on a standby node. In front of all of the components on a node, an nginx webserver handles communication between the components and the outside world as well as serving the segments stored locally. Each component runs in its own Docker container and each node is managed using docker-compose. Most components (other than the database and Thrimbletrimmer, which is mostly Javascript) are written in Python.

Further details and the code itself can be found on the Wubloader GitHub page. This current version of the Wubloader was designed by ekimekim and developed by ekimekim, Chrusher and Master_Gunner.