New updates and improvements to Milk Video.

In the new version of Milk Video, audio and video files are supported and transcribed immediately. When you drag or import files, you will automatically have captions to use in a designed video.

The changes support podcast producers interested in creating video previews for social media.

Automatic transcription saves you a step when you’re importing media. Rather than asking you to confirm that you want to transcribe, we just start transcribing, since our users almost never choose not to.

It’s now possible to upload, create with, and share your own custom fonts in Milk Video.

We currently support over 1,000 free fonts from Google Fonts, but businesses with strict design guidelines often require specific fonts. Now your entire team can share user-uploaded fonts and use them in any video you create.

You can upload any TrueType, Web Open Font Format (WOFF), or OpenType font file. Custom fonts can be used in title layers or word-by-word captions.
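If you want to check a font file before uploading it, the three formats above can be told apart by the first four bytes of the file. The sketch below is only an illustration and not how Milk Video validates uploads; the function name `detect_font_format` is made up for this example, and including WOFF2 is an assumption, since only "Web Open Font Format" is mentioned above.

```python
from typing import Optional

# Four-byte signatures found at the start of each font container format.
_SIGNATURES = {
    b"\x00\x01\x00\x00": "TrueType",  # also used by OpenType fonts with TrueType outlines
    b"true": "TrueType",              # older Apple TrueType fonts
    b"OTTO": "OpenType",              # OpenType fonts with CFF outlines
    b"wOFF": "WOFF",
    b"wOF2": "WOFF2",                 # assumption: WOFF2 support is not stated above
}


def detect_font_format(data: bytes) -> Optional[str]:
    """Return a format name if the data starts with a known font signature, else None."""
    return _SIGNATURES.get(data[:4])
```

For a file on disk, you could call `detect_font_format(open(path, "rb").read(4))` and treat a `None` result as "probably not a supported font".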

Major server-side and client-side performance improvements have been made to Milk Video.


Our app dashboard and page navigation have been reworked so that external files are downloaded only when they are actually being displayed.


Our caption editing interface has been reworked to minimize the amount of unnecessary caption content displayed. With these improvements, corrections made in the transcript editor now also apply to video sections that have already been clipped.


Server-side performance has been dramatically improved to reduce the time it takes to sync client-side changes.


Milk Video’s video rendering pipeline has improved its parallel download process. Users will no longer experience long wait times when the site is under heavy load.
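For readers curious about what a parallel download step can look like in general, here is a minimal sketch using a bounded thread pool. It is not Milk Video's rendering pipeline; `download_all`, the `fetch` callback, and the `max_workers` default are all hypothetical names and values chosen for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def download_all(
    urls: Iterable[str],
    fetch: Callable[[str], bytes],
    max_workers: int = 4,
) -> Dict[str, bytes]:
    """Fetch every URL using at most `max_workers` concurrent workers."""
    url_list = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order; an exception raised by
        # fetch is re-raised here when its result is reached.
        return dict(zip(url_list, pool.map(fetch, url_list)))
```

Capping the number of workers keeps one large render from starving the others, which is one common way to keep wait times steady under load.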

As video creators, a large part of our work is about exploring constraints and possibilities, which is often a very iterative process.

Most of this work involves repetitive tasks, like “What if the caption line ended slightly earlier?” or “Would it help to change the color contrast?”

This tedious work can take up a huge amount of time. Simply rearranging layer items on a timeline might mean performing 10 minute operations and 9 click-and-select operations, all while keeping track of the remaining scenes in your head.

It shouldn’t have to be this hard. After all, we have these amazingly powerful computers at our disposal, and one of their fundamental strengths is effective repetition and reuse. A computer can perfectly repeat the same action many times at blazing speed, something that writing professionals, software engineers, data researchers, and others already take advantage of, but designers much less so.

Milk Video’s design canvas now supports multi-item selection, grouping, and layer changes. When you adjust scale, position, color, or fonts, items can now be manipulated as a group.
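To make the idea of group manipulation concrete, here is a small sketch of moving and scaling several canvas items at once. It is an illustration only, not Milk Video's canvas code; the `Item` class and both functions are hypothetical, and scaling about the top-left corner of the group's bounding box is just one possible choice of anchor.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    x: float
    y: float
    width: float
    height: float


def move_group(items: List[Item], dx: float, dy: float) -> None:
    """Shift every selected item by the same offset."""
    for item in items:
        item.x += dx
        item.y += dy


def scale_group(items: List[Item], factor: float) -> None:
    """Scale every item about the top-left corner of the group's bounding box."""
    if not items:
        return
    origin_x = min(item.x for item in items)
    origin_y = min(item.y for item in items)
    for item in items:
        # Scale the offset from the shared origin so relative spacing is preserved.
        item.x = origin_x + (item.x - origin_x) * factor
        item.y = origin_y + (item.y - origin_y) * factor
        item.width *= factor
        item.height *= factor
```

Because every item is transformed relative to the same origin, the layout inside the group keeps its proportions, which is what replaces the repeated per-item adjustments described earlier.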