This is a very long and involved post that details our work on this episode. Future posts will not be this long, but we felt that some of our fans might be interested to know how we went about encoding this particular episode. If you simply want the torrent or DDL links, then stop here.
Warning: The 720p version has not been tested on the Xbox 360. We have tested it on the Playstation 3. We have full confidence that the 720p video will work on the Xbox 360, but if you find that you are unable to play it on the Xbox 360, please notify us immediately.
Part 1 | Part 2
OK, so we’ve finally gotten around to re-encoding the BDs for this series. The first time around was a complete disaster. The second time around had weird corruption issues. And now, we’re doing this a third time, except this time we’re gonna get it right. Our goals:
- Ensure the perception of high quality
- Use anti-banding filters to restore lost gradients in the source
- Improve image encoding and compression by specifying encoding settings for scenes that require special considerations (more on this below)
- Use FFMS over DirectShow. We’ve only been using FFMS since The Disappearance of Haruhi Suzumiya (1080p), which is why there’s corruption in the 720p version
- Prevent all corruption
- Make the torrent (1080p) tolerable. Target file size: 2.5 gigabytes (~6000 kilobits/second)
- Increase the quality of gradients
- Make the DDLs (720p) tolerable. Target file size: 1.3 gigabytes (~3100 kilobits/second)
- Use smoothing filters on flat colored areas to wash out details and improve compression
As you can see, Katanagatari presents a challenging problem: how do we encode a 43-minute gradient-heavy, highly detailed, and highly dynamic anime without going over 1.3 gigabytes for the 720p and 2.5 gigabytes for the 1080p, and without sacrificing gradients or details? That is an incredibly difficult thing to do. In fact, you could say that this is the very struggle of developing lossy compression techniques. I mean, we’re having issues keeping Ore no Imouto under control, so how are we supposed to do that with Katanagatari? x264 is an amazing compression library, but it can’t do magic. So if we can’t use magic, then we’d better figure out a way to fake it.
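To see what those targets imply for the video track, here’s a back-of-the-envelope estimate. This counts the video stream only; audio and container overhead come on top of it, and the numbers are purely illustrative:

```python
def video_size_gib(bitrate_kbps, minutes):
    """Approximate video-track size in GiB for a given average
    bit-rate, ignoring audio and container overhead."""
    bits = bitrate_kbps * 1000 * minutes * 60
    return bits / 8 / 1024**3

# ~6000 kbps over a 43-minute episode is roughly 1.8 GiB of video
print(round(video_size_gib(6000, 43), 2))  # 1.8
```

The gap between that figure and the full release size is what’s left over for the audio track and the container.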
One of the biggest issues with Katanagatari is that some scenes are radically different from others. The first two minutes of this episode feature heavy grain and noise. The rest of the episode is fairly steady. Compare the above image with the one below and the one at the top.
We’re going from white text on black, to a heavily grained image, to complex lines and fine details. It’s kinda difficult to pick a “median” CRF value and hope that x264 doesn’t waste tons of bits preserving the grain at the beginning and blow up the file size. By the same token, if we try to limit the file size with a target bit-rate, x264 is going to spend a lot of bits at the beginning to preserve the grain and will sacrifice the quality of later scenes in order to hit the target file size.
So the whole issue revolves around tweaking the rate-control method and its associated constraints for specific scenes. There is an option in x264 that lets you manipulate analysis methods for arbitrary ranges of frames; it’s encapsulated in the --zones switch. But the only things you’re really allowed to modify are the motion estimation, the deblocking strength, the subpixel estimation complexity, and the quantizer level. Now, modifying the quantizer would be nice, except we don’t use a constant quantizer to encode our media. We don’t want a context-blind encoding scheme that simply forces everything to a certain “quality” regardless of scene context, which is exactly what a constant quantizer does.
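For reference, the value passed to x264’s --zones switch is a `/`-separated list of zones, each written as `start_frame,end_frame` followed by the options to override. A small helper makes that easy to build; the frame ranges below are made up for illustration:

```python
def zone(start, end, **opts):
    """Format a single x264 zone as 'start,end,key=value,...'."""
    return ",".join([str(start), str(end)] + [f"{k}={v}" for k, v in opts.items()])

# Hypothetical: raise the quantizer (i.e. lower the quality) over a
# grainy opener, then drop it back for the rest of the episode.
arg = "/".join([zone(0, 2880, q=22), zone(2881, 61000, q=18)])
print(arg)  # 0,2880,q=22/2881,61000,q=18
```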
Real quick before you get confused, there are three predominant, mutually exclusive rate-control methods in x264:
- Constant Quantizer (QP): Specifies a blind quality level. x264 takes each frame and encodes it at the specified quantizer, without taking scene context into account. Say I have an extremely dynamic video with lots of still scenes and lots of fast scenes: with QP, x264 would encode all of them at the same quantizer regardless.
- CRF (Constant Rate Factor): Specifies a certain perceived quality threshold. x264 takes a frame and calculates how important its details are relative to the frames around it. If the frame is part of a scene that is extremely short or extremely fast (lots of camera movement), its quality is reduced, because your eyes probably wouldn’t be able to appreciate the details anyway. The inverse is also true: if a frame is part of a scene that is still or long, its quality is cranked up, because your eyes will be viewing it for a longer period of time. If your CRF value is X, then fast scenes will get a QP of roughly X + K, while slow scenes will get a QP of roughly X − J, where K and J are positive offsets. Remember, lower values for CRF and QP mean higher image quality, with 0 being lossless. Let that sink in for a bit.
- Bit-rate Constrained: Specifies a target bit-rate, and thus a target file size. This can be used in a single pass (a bad idea) or in two passes. The first pass analyzes the video and determines what type of frame to place at each point; the second pass actually encodes the video. It’s one of the simplest rate-control methods to describe, yet one of the hardest to implement properly.
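The CRF behaviour described above can be caricatured in a few lines. This is a deliberately crude toy model for intuition only, not x264’s actual rate-control math:

```python
def effective_qp(crf, motion, k=4.0):
    """Toy model of CRF: scenes with more motion (motion in [0, 1],
    0.5 = average) get a higher QP (less detail kept), while still
    scenes get a lower QP. Not x264's real algorithm."""
    return crf + k * (motion - 0.5)

print(effective_qp(18, 1.0))  # fast scene: 20.0
print(effective_qp(18, 0.0))  # still scene: 16.0
```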
So in order to overcome these limitations, we simply split the encoding job into separate pieces based on the scenes in question. That answer may sound obvious at first, but there isn’t any documented technique for concatenating two or more raw x264 bit-streams. So we did some experimenting and found a way to concatenate the raw bit-streams, but it requires particular attention to the frame types at each splice point.
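The post doesn’t spell the method out, but at the byte level a raw Annex B H.264 stream can simply be concatenated, provided each segment opens with its own SPS/PPS headers and an IDR frame so the decoder has a clean random-access point at every splice. A minimal sketch under that assumption:

```python
def concat_raw(parts, out_path):
    """Byte-concatenate raw .264 segments into one stream.
    Assumes each segment begins with SPS/PPS and an IDR frame;
    otherwise the decoder will break at the splice point."""
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as f:
                out.write(f.read())
```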
Without getting into details and extending this rather long post, I’m just gonna say that some scenes were encoded at lower quality settings because your eyes can’t tell the difference. Specifically, we altered the AQ strength, the deblocking strength, the rate-control methods and their associated constraints, and the motion-estimation methods for each scene in question. This is basically how we managed to end up with a rather reasonable file size.
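In that spirit, a per-scene encode boils down to generating a different x264 command line for each piece. The settings below are invented for illustration; the group hasn’t published the values they actually used:

```python
# Hypothetical per-scene overrides; the values are illustrative only.
SCENES = [
    ("grainy opener", {"crf": 19, "aq-strength": 0.6, "deblock": "-2:-2", "me": "umh"}),
    ("main episode",  {"crf": 17, "aq-strength": 1.0, "deblock": "0:0",   "me": "hex"}),
]

def x264_flags(settings):
    """Expand a settings dict into a list of x264 CLI flags."""
    flags = []
    for key, value in settings.items():
        flags += [f"--{key}", str(value)]
    return flags

print(x264_flags(SCENES[0][1]))
```

Each scene’s flag list would then be appended to an x264 invocation, and the resulting raw streams spliced back together.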
Of course, given that we were limited to a certain file size, both of our releases are bit-rate starved. The 720p is definitely the more starved of the two, but if you look closely at the 1080p version, you may still see some slight banding. For the most part, though, both releases should look pretty damn good.
Now, onwards with Ore no Imouto (4).