
How to datamosh videos with data corruption


Glitching videos with data corruption can be a tricky process because video formats are substantially more complex than image formats. Since video formats contain audio and timing information in addition to visual information, corrupting the wrong section of a video file can quickly render the file unplayable rather than delightfully distorted.

This tutorial will focus on glitching the popular MP4 and MOV formats containing video compressed with the H.264 standard. If the video you want to destroy is not in MP4 or MOV format already, an easy way to convert it is to upload it to YouTube, let them convert it, and then download the result. If a video doesn’t set off any copyright claims, you can download it from the YouTube Video Manager in MP4 format.

Get started by making a copy of your MOV or MP4 and opening the copy in a hex editor. Never edit the original file. If you don’t have a hex editor installed, there are some freeware options listed at the bottom of this post. Hex editors allow us to view and edit the bytes of a file using hexadecimal. Editing the file using hex rather than text allows greater flexibility since we’re no longer restricted to text characters: every byte, whether or not it maps to a printable character, is represented by two hex digits.
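To get a feel for what a hex editor shows, here is a minimal Python sketch that prints bytes in the usual offset, hex and text columns. The function name hexdump and the sample bytes are made up for illustration, and a real hex editor does far more than this.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Format bytes roughly the way a hex editor displays them:
    an offset, a hex column, then a text (ASCII) column."""
    pad = width * 3 - 1  # width of a full hex column, spaces included
    lines = []
    for offset in range(0, len(data), width):
        row = data[offset:offset + width]
        # Every byte becomes exactly two hex digits.
        hex_part = " ".join(f"{b:02x}" for b in row)
        # Bytes with no printable character are shown as '.'.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in row)
        lines.append(f"{offset:08x}  {hex_part:<{pad}}  {text_part}")
    return "\n".join(lines)


# Illustrative sample: the kind of bytes found at the start of a MOV file.
print(hexdump(b"\x00\x00\x00\x14ftypqt  "))
```

The first four bytes in the sample have no printable character, which is exactly the kind of data that is awkward to edit as text but easy to edit as hex.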

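We know that we’re looking at a MOV file when we see the text ftypqt starting on the fifth byte of the file, as illustrated in the example below. MP4 files show the same ftyp string in the same position, followed by a different brand such as isom or mp42 instead of qt.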

[Image: hex editor showing ftypqt near the start of the file]
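The same check can be sketched in Python. The function names here are hypothetical, and the check is only a heuristic: it looks for ftyp at the fifth byte and reads the four-character brand that follows it. Some files may not start with an ftyp atom at all, so a negative result is not proof that the file is something else.

```python
def looks_like_mp4_or_mov(header: bytes) -> bool:
    """Return True if the fifth through eighth bytes spell 'ftyp'."""
    return len(header) >= 8 and header[4:8] == b"ftyp"


def major_brand(header: bytes) -> str:
    """Return the 4-character brand right after 'ftyp', e.g. 'qt  ' or 'isom'."""
    return header[8:12].decode("ascii", errors="replace")


# Illustrative bytes; for a real file, read the first 12 bytes of the copy
# opened in binary mode instead.
sample = b"\x00\x00\x00\x14ftypqt  "
if looks_like_mp4_or_mov(sample):
    print("brand:", repr(major_brand(sample)))
```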

The MP4 and MOV (QuickTime) formats use a similar structure: the file is broken down into atoms, or blocks of data. The atom which contains the raw frame and audio data can be identified by its atom type string, which in this case is mdat (short for media data).

[Image: hex editor showing the mdat atom in a MOV/MP4 file]

The data contained within the mdat atom is composed of chunks, which are composed of NAL units, which are composed of slices. For the purposes of this tutorial we won’t delve that deep. Finding the mdat atom can be done by searching the file for the string “mdat” as seen in the above image. Notice that the contents of the mdat atom seem quite random when viewed in a hex editor; in contrast, the contents of the other atoms in MP4 and MOV files are very structured. With this information we can find the end of the mdat atom by scrolling down through it to see where the data stops looking random, or by scrolling up from the bottom of the file to see where the data starts to look random, as illustrated in the image below.

[Image: hex editor showing the moov atom following the mdat atom]

Here we can see that the mdat atom is followed by the moov atom (with an mvhd atom nested inside it). This is not always the case, as the order of the atoms can be different. What’s important to note, though, is that the file is visibly more structured after the mdat atom; this is how we can identify where the mdat atom ends. The mdat atom will, in all cases, either continue to the end of the file or be followed by another atom identified by a 4-character string, as with the moov atom in the above example. Using this method we can identify both the start and the end of the mdat atom, and in turn the region where we can corrupt only the mdat atom’s contents and have a fair chance of the video still being playable.
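Eyeballing the boundaries works, but they can also be computed. Each atom begins with a 4-byte big-endian size followed by its 4-character type, so a script can hop from atom to atom. The sketch below is a minimal illustration rather than a full parser: the function names are hypothetical, it only looks at top-level atoms, and it assumes the whole file fits in memory.

```python
import struct


def top_level_atoms(data: bytes):
    """Yield (type, payload_start, end) for each top-level atom."""
    pos = 0
    while pos + 8 <= len(data):
        size, kind = struct.unpack(">I4s", data[pos:pos + 8])
        header = 8
        if size == 1 and pos + 16 <= len(data):
            # A size of 1 means the real 64-bit size follows the type.
            size = struct.unpack(">Q", data[pos + 8:pos + 16])[0]
            header = 16
        elif size == 0:
            # A size of 0 means the atom runs to the end of the file.
            size = len(data) - pos
        if size < header:
            break  # malformed size, stop rather than loop forever
        end = min(pos + size, len(data))
        yield kind.decode("latin-1"), pos + header, end
        pos += size


def find_mdat(data: bytes):
    """Return (start, end) of the first mdat atom's contents, or None."""
    for kind, start, end in top_level_atoms(data):
        if kind == "mdat":
            return start, end
    return None


# Tiny synthetic file: an ftyp atom, a 10-byte mdat, then an empty moov.
demo = (struct.pack(">I4s", 16, b"ftyp") + b"qt  " + bytes(4)
        + struct.pack(">I4s", 18, b"mdat") + bytes(range(10))
        + struct.pack(">I4s", 8, b"moov"))
print(find_mdat(demo))
```

For a real video, the bytes would come from reading the working copy in binary mode. The returned range covers only the contents of the mdat atom, not its 8-byte (or 16-byte) header, which should be left alone.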

Once we’ve identified the boundaries of the mdat atom we can begin to copy and paste, replace or edit portions of the raw hex data (or the ASCII text data, either will work) of the video, checking the result along the way by attempting to play the video. Making backup copies after every successful change will avoid heartaches when, not if, a misstep renders the video unplayable.
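If the editing is being done from a script, the backup habit can be automated too. This is a small sketch with a hypothetical helper name and naming scheme (clip.bak1.mp4, clip.bak2.mp4 and so on); any scheme that never overwrites an earlier backup would do.

```python
import shutil
from pathlib import Path


def save_checkpoint(path: str) -> Path:
    """Copy the file to the next free numbered backup name and return it."""
    src = Path(path)
    n = 1
    while True:
        candidate = src.with_name(f"{src.stem}.bak{n}{src.suffix}")
        if not candidate.exists():
            break
        n += 1
    shutil.copy2(src, candidate)  # copies contents and timestamps
    return candidate
```

Calling save_checkpoint on the working copy after each change that still plays gives a trail of known-good files to fall back on.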

Some notes on successful corrupting:

  • It doesn’t take much corruption to add bizarre distortion to a video. Corrupting 10% of a file or less, even just a couple of bytes here and there, has the potential to send playback into a wild frenzy.
  • While not guaranteed, most data in the mdat will be stored in playback order, so if we want to corrupt a specific portion of the video we can estimate where the data for that portion sits in the mdat based on its time.
  • The mdat atom will also contain raw audio data, so if the audio becomes distorted during playback we know we’ve gone too far, or started too early, in the mdat atom.
  • Copying and pasting hex within NAL units is probably the best way to corrupt H.264 video data, as you’ll be shuffling valid data around rather than adding outright gibberish.
  • As mentioned previously, the mdat is composed of chunks. Since chunks and the NAL units contained within them have structure, it’s best to corrupt small portions of data in various spots rather than large swaths. Corrupting large regions of data will inevitably cross over structure boundaries and destroy important information regarding the type of NAL unit or slice.
  • To stay inside NAL units, look for hex 67, 68 or 00 00 01, as these sequences can denote the beginning of a new NAL unit. Treat them as hints only: 00 00 01 is the start code used by raw H.264 byte streams, while MP4 and MOV files usually prefix each NAL unit with its length instead, and bytes like 67 or 68 also turn up by chance in compressed data.
  • This definitely falls under the bull-in-a-china-shop category of datamoshing video, so don’t be discouraged if it takes a couple of restarts to get a playable result.
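A couple of these notes can be sketched in Python. The first function estimates a byte offset from a time, assuming the data is stored in playback order at a roughly constant bitrate, which is only an approximation. The second copies a few short runs of existing bytes over other spots inside a given range, a scripted stand-in for copying and pasting in a hex editor. Both function names and all default values are made up for illustration, and the sketch pays no attention to NAL unit boundaries.

```python
import random


def estimate_offset(start: int, end: int, time_s: float, duration_s: float) -> int:
    """Rough byte offset for a moment in the video within [start, end)."""
    fraction = min(max(time_s / duration_s, 0.0), 1.0)
    return start + int((end - start) * fraction)


def shuffle_small_runs(data: bytes, start: int, end: int,
                       count: int = 10, run: int = 8, seed=None) -> bytes:
    """Overwrite `count` short runs inside [start, end) with bytes copied
    from elsewhere in the same range. The file length never changes."""
    rng = random.Random(seed)
    out = bytearray(data)
    if end - start < run:
        return bytes(out)  # range too small to hold a single run
    for _ in range(count):
        src = rng.randrange(start, end - run + 1)
        dst = rng.randrange(start, end - run + 1)
        out[dst:dst + run] = out[src:src + run]
    return bytes(out)


# Illustrative use on synthetic bytes; only offsets 64..191 may change.
original = bytes(range(256))
glitched = shuffle_small_runs(original, 64, 192, seed=1)
print(estimate_offset(100, 1100, 30, 60))
```

Overwriting in place rather than inserting keeps every byte after the edit at its original position, which matters because other atoms refer to positions in the file. Passing a narrower range, for example one starting at an estimated offset, limits the damage to one part of the video.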

The video included in this post was glitched using this technique; however, the audio was slowed down using traditional video editing.

As with any glitch-by-corruption technique, it requires a soft touch — too little has no effect, too much can destroy the file, but just enough results in glorious, glorious corruption. #corruptabsolutely

Windows

OSX

How to glitch JPG images with data corruption

Glitching JPG (or JPEG) images by corrupting their data is a relatively straightforward affair. Simply open up the file in a hex editor and wreck up the place. Corrupted JPG images can be identified by the telltale offset horizontal bands of changing hues and small square sequences of artifact patterns at the beginning or end of these bands.

If you don’t have the time or inclination to manually glitch JPG images you can use dedicated apps such as Glitch for iOS.

[Image: glitch4ios]

If you don’t have a hex editor installed, there are some freeware options listed at the bottom of this post. Hex editors allow us to view and edit the bytes of a file using hexadecimal. Editing the file using hex rather than text allows greater flexibility since we’re no longer restricted to text characters: every byte, whether or not it maps to a printable character, is represented by two hex digits. Most hex editors display both hex and text (also known as ASCII) in the same view but in separate columns. You can see a JPG open in a hex editor in the image below.

[Image: a JPG with Exif data open in a hex editor]

Get started by making a copy of a JPG and opening the copy in a hex editor. Never edit the original file. The first bytes of a JPG file contain what’s known as the file header. The header contains information that is required for the image to be displayed at all and should be left intact (though feel free to experiment). We need to locate the meat of the file, the raw image data. We can usually tell the raw data apart from the header and other important structural data by its garbled nature.
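The point where the header ends can also be found programmatically. A JPG starts with the bytes FF D8, followed by a series of segments that each begin with FF, a marker byte and a 2-byte big-endian length; the compressed image data begins right after the Start of Scan segment (marker FF DA). The sketch below walks those segments. The function name is hypothetical, and it only finds the first scan, so for progressive JPGs, which contain several, it is just a starting point.

```python
import struct


def jpeg_scan_start(data: bytes) -> int:
    """Return the offset of the first byte of compressed image data,
    i.e. the byte just after the first Start of Scan (FF DA) header."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing FF D8 at the start")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:
            pos += 1  # padding byte before a marker, skip it
            continue
        # The length counts its own two bytes but not the marker.
        length = struct.unpack(">H", data[pos + 2:pos + 4])[0]
        if marker == 0xDA:
            return pos + 2 + length
        pos += 2 + length
    raise ValueError("no Start of Scan marker found")


# Tiny synthetic example: one header segment, a scan header, fake data.
demo = (b"\xff\xd8"
        + b"\xff\xe0" + struct.pack(">H", 4) + b"AB"
        + b"\xff\xda" + struct.pack(">H", 3) + b"\x00"
        + b"scan" + b"\xff\xd9")
print(jpeg_scan_start(demo))
```

Everything before the returned offset is header and structural data; everything from it up to the final FF D9 is fair game.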

[Image: hex editor showing where a JPG changes from structured header data to seemingly random image data]

We can see where some data ripe for glitching is in the above example, where the file changes from structured to seemingly random data. That’s not to say that there’s no structure; it’s just harder to discern in compressed image data. Once we’ve identified the raw image data we can copy/paste chunks, search/replace sequences or just manually corrupt the data by changing the text (on the right side in the example) or the hex (on the left side in the example) of the raw data.
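Manual corruption of this kind can be approximated with a few lines of Python. The sketch below overwrites a handful of randomly chosen bytes inside a given range. As a precaution that is an assumption of this sketch rather than a rule of the technique, it never writes the value FF and never touches an existing FF or the byte after it, since FF introduces markers in a JPG and stray markers are a quick way to an unviewable file. The function name and defaults are made up for illustration.

```python
import random


def glitch_bytes(data: bytes, start: int, end: int,
                 count: int = 5, seed=None) -> bytes:
    """Overwrite up to `count` bytes in [start, end) with random values,
    leaving FF bytes and the byte following each FF untouched."""
    rng = random.Random(seed)
    out = bytearray(data)
    candidates = [
        i for i in range(start, end)
        if out[i] != 0xFF and (i == 0 or out[i - 1] != 0xFF)
    ]
    for i in rng.sample(candidates, min(count, len(candidates))):
        out[i] = rng.randrange(0x00, 0xFF)  # 0..254, never FF
    return bytes(out)


# Synthetic stand-in for image data between the header and the FF D9 end.
original = b"\xff\xd8" + bytes([1, 2, 3, 4, 5, 6, 7, 8]) + b"\xff\xd9"
glitched = glitch_bytes(original, 2, len(original), count=3, seed=0)
print(original.hex(" "))
print(glitched.hex(" "))
```

Starting with a small count and raising it gradually mirrors the advice to make a change, check the result, and only then go further.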

Now we can begin the process of making changes and checking the result in our favorite image viewer. Making backup copies after every successful change will avoid heartaches when, not if, a misstep renders the image unviewable.

[Image: enlarged artifact patterns in a corrupted JPG]

Corrupting JPG images often results in interesting patterns due to the corrupt data and the compression algorithms used, as seen enlarged in the example above. Decreasing the quality of the JPG itself, which can be done with image editing software, can sometimes increase the likelihood of generating these artifacts through corruption.

This method can also be used to glitch some other formats, most notably BMP files.
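For BMP files the header is easy to step over, because the file itself records where the pixel data begins: a BMP starts with the letters BM, and bytes 10 to 13 hold the offset of the pixel data as a little-endian number. A minimal sketch, with a hypothetical function name:

```python
import struct


def bmp_pixel_data_offset(data: bytes) -> int:
    """Return the offset at which a BMP file's pixel data starts."""
    if data[:2] != b"BM" or len(data) < 14:
        raise ValueError("not a BMP: missing BM signature or header too short")
    return struct.unpack("<I", data[10:14])[0]


# Synthetic 58-byte file: 14-byte file header, 40 bytes standing in for
# the info header, then 4 bytes of pixel data.
demo = b"BM" + struct.pack("<IHHI", 58, 0, 0, 54) + bytes(40) + b"\x01\x02\x03\x04"
print(bmp_pixel_data_offset(demo))
```

Since uncompressed BMP pixel data has no markers to trip over, anything from that offset onward can usually be changed without making the file unreadable.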

As with any glitch-by-corruption technique, too little has no effect, too much can destroy the file, but just enough results in glorious, glorious corruption. #corruptabsolutely

iOS (iPhone, iPad)

Windows Phone

Windows

OSX